Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
SPDX-License-Identifier: CC-BY-SA-4.0

Linear Learner Hyperparameters

The following table contains the hyperparameters for the linear learner algorithm. These are parameters that users set to facilitate the estimation of model parameters from data. The required hyperparameters are listed first, in alphabetical order. The optional hyperparameters are listed next, also in alphabetical order.

Parameter Name Description
feature_dim The number of features in the input data. Required Valid values: Positive integer
num_classes The number of classes for the response variable. The algorithm assumes that classes are labeled 0, ..., num_classes - 1. Required when predictor_type is multiclass_classifier. Otherwise, the algorithm ignores it. Valid values: Integers from 3 to 1,000,000
predictor_type Specifies the type of target variable as a binary classification, multiclass classification, or regression. Required Valid values: binary_classifier, multiclass_classifier, or regressor
accuracy_top_k When computing the top-k accuracy metric for multiclass classification, the value of k. If the model assigns one of the top-k scores to the true label, an example is scored as correct. Optional Valid values: Positive integers Default value: 3
balance_multiclass_weights Specifies whether to use class weights, which give each class equal importance in the loss function. Used only when the predictor_type is multiclass_classifier. Optional Valid values: true, false Default value: false
beta_1 The exponential decay rate for first-moment estimates. Applies only when the optimizer value is adam. Optional Valid values: auto or floating-point value between 0 and 1.0 Default value: auto
beta_2 The exponential decay rate for second-moment estimates. Applies only when the optimizer value is adam. Optional Valid values: auto or floating-point value between 0 and 1.0 Default value: auto
bias_lr_mult Allows a different learning rate for the bias term. The actual learning rate for the bias is learning_rate * bias_lr_mult. Optional Valid values: auto or positive floating-point value Default value: auto
bias_wd_mult Allows different regularization for the bias term. The actual L2 regularization weight for the bias is wd * bias_wd_mult. By default, there is no regularization on the bias term. Optional Valid values: auto or non-negative floating-point value Default value: auto
binary_classifier_model_selection_criteria When predictor_type is set to binary_classifier, the model evaluation criteria for the validation dataset (or for the training dataset if you don’t provide a validation dataset). Criteria include: [See the AWS documentation website for more details] Optional Valid values: accuracy, f_beta, precision_at_target_recall, recall_at_target_precision, or loss_function Default value: accuracy
early_stopping_patience If no improvement is made in the relevant metric, the number of epochs to wait before ending training. If you have provided a value for binary_classifier_model_selection_criteria, the metric is that value. Otherwise, the metric is the same as the value specified for the loss hyperparameter. The metric is evaluated on the validation data. If you haven’t provided validation data, the metric is always the same as the value specified for the loss hyperparameter and is evaluated on the training data. To disable early stopping, set early_stopping_patience to a value greater than the value specified for epochs. Optional Valid values: Positive integer Default value: 3
early_stopping_tolerance The relative tolerance to measure an improvement in loss. If the ratio of the improvement in loss divided by the previous best loss is smaller than this value, early stopping considers the improvement to be zero. Optional Valid values: Positive floating-point value Default value: 0.001
epochs The maximum number of passes over the training data. Optional Valid values: Positive integer Default value: 15
f_beta The value of beta to use when calculating F score metrics for binary or multiclass classification. Also used if the value specified for binary_classifier_model_selection_criteria is f_beta. Optional Valid values: Positive floating-point value Default value: 1.0
huber_delta The parameter for Huber loss. During training and metric evaluation, compute L2 loss for errors smaller than delta and L1 loss for errors larger than delta. Optional Valid values: Positive floating-point value Default value: 1.0
init_bias Initial weight for the bias term. Optional Valid values: Floating-point value Default value: 0
init_method Sets the initial distribution function used for model weights. Functions include: [See the AWS documentation website for more details] Optional Valid values: uniform or normal Default value: uniform
init_scale Scales an initial uniform distribution for model weights. Applies only when the init_method hyperparameter is set to uniform. Optional Valid values: Positive floating-point value Default value: 0.07
init_sigma The initial standard deviation for the normal distribution. Applies only when the init_method hyperparameter is set to normal. Optional Valid values: Positive floating-point value Default value: 0.01
l1 The L1 regularization parameter. If you don’t want to use L1 regularization, set the value to 0. Optional Valid values: auto or non-negative floating-point value Default value: auto
learning_rate The step size used by the optimizer for parameter updates. Optional Valid values: auto or positive floating-point value Default value: auto, whose value depends on the optimizer chosen.
loss Specifies the loss function. The available loss functions and their default values depend on the value of predictor_type: [See the AWS documentation website for more details] Optional Valid values: auto, logistic, squared_loss, absolute_loss, hinge_loss, eps_insensitive_squared_loss, eps_insensitive_absolute_loss, quantile_loss, or huber_loss Default value: auto
loss_insensitivity The parameter for the epsilon-insensitive loss type. During training and metric evaluation, any error smaller than this value is considered to be zero. Optional Valid values: Positive floating-point value Default value: 0.01
lr_scheduler_factor Each time the number of steps set by lr_scheduler_step elapses, the learning rate decreases by this quantity. Applies only when the use_lr_scheduler hyperparameter is set to true. Optional Valid values: auto or positive floating-point value between 0 and 1 Default value: auto
lr_scheduler_minimum_lr The learning rate never decreases to a value lower than the value set for lr_scheduler_minimum_lr. Applies only when the use_lr_scheduler hyperparameter is set to true. Optional Valid values: auto or positive floating-point value Default value: auto
lr_scheduler_step The number of steps between decreases of the learning rate. Applies only when the use_lr_scheduler hyperparameter is set to true. Optional Valid values: auto or positive integer Default value: auto
margin The margin for the hinge_loss function. Optional Valid values: Positive floating-point value Default value: 1.0
mini_batch_size The number of observations per mini-batch for the data iterator. Optional Valid values: Positive integer Default value: 1000
momentum The momentum of the sgd optimizer. Optional Valid values: auto or a floating-point value between 0 and 1.0 Default value: auto
normalize_data Normalizes the feature data before training. Data normalization shifts the data for each feature to have a mean of zero and scales it to have unit standard deviation. Optional Valid values: auto, true, or false Default value: true
normalize_label Normalizes the label. Label normalization shifts the label to have a mean of zero and scales it to have unit standard deviation. The auto default value normalizes the label for regression problems but does not for classification problems. If you set the normalize_label hyperparameter to true for classification problems, the algorithm ignores it. Optional Valid values: auto, true, or false Default value: auto
num_calibration_samples The number of observations from the validation dataset to use for model calibration (when finding the best threshold). Optional Valid values: auto or positive integer Default value: auto
num_models The number of models to train in parallel. For the default, auto, the algorithm decides the number of parallel models to train. One model is trained according to the given training parameters (regularization, optimizer, loss), and the rest are trained with similar parameter values. Optional Valid values: auto or positive integer Default value: auto
num_point_for_scaler The number of data points to use for calculating normalization or unbiasing of terms. Optional Valid values: Positive integer Default value: 10,000
optimizer The optimization algorithm to use. Optional Valid values: [See the AWS documentation website for more details] Default value: auto. The default setting for auto is adam.
positive_example_weight_mult The weight assigned to positive examples when training a binary classifier. The weight of negative examples is fixed at 1. If you want the algorithm to choose a weight so that errors in classifying negative vs. positive examples have equal impact on training loss, specify balanced. If you want the algorithm to choose the weight that optimizes performance, specify auto. Optional Valid values: balanced, auto, or a positive floating-point value Default value: 1.0
quantile The quantile for quantile loss. For quantile q, the model attempts to produce predictions so that the value of true_label is greater than the prediction with probability q. Optional Valid values: Floating-point value between 0 and 1 Default value: 0.5
target_precision The target precision. If binary_classifier_model_selection_criteria is recall_at_target_precision, then precision is held at this value while recall is maximized. Optional Valid values: Floating-point value between 0 and 1.0 Default value: 0.8
target_recall The target recall. If binary_classifier_model_selection_criteria is precision_at_target_recall, then recall is held at this value while precision is maximized. Optional Valid values: Floating-point value between 0 and 1.0 Default value: 0.8
unbias_data Unbiases the features before training so that the mean is 0. By default, data is unbiased if the use_bias hyperparameter is set to true. Optional Valid values: auto, true, or false Default value: auto
unbias_label Unbiases labels before training so that the mean is 0. Applies to regression only if the use_bias hyperparameter is set to true. Optional Valid values: auto, true, or false Default value: auto
use_bias Specifies whether the model should include a bias term, which is the intercept term in the linear equation. Optional Valid values: true or false Default value: true
use_lr_scheduler Whether to use a scheduler for the learning rate. If you want to use a scheduler, specify true. Optional Valid values: true or false Default value: true
wd The weight decay parameter, also known as the L2 regularization parameter. If you don’t want to use L2 regularization, set the value to 0. Optional Valid values: auto or non-negative floating-point value Default value: auto
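
As an illustration of how the required hyperparameters in the table relate to each other, the following sketch collects a set of values in a plain Python dictionary and checks the constraints the table states for feature_dim, predictor_type, and num_classes. The helper name validate_hyperparameters and the example values are hypothetical; this is not part of any AWS SDK, and it covers only a small subset of the table.

```python
# Hypothetical helper: checks only the constraints the table states for the
# three required hyperparameters. It is not an AWS API.
VALID_PREDICTOR_TYPES = {"binary_classifier", "multiclass_classifier", "regressor"}


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_hyperparameters(hp):
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    feature_dim = hp.get("feature_dim")
    if not _is_int(feature_dim) or feature_dim <= 0:
        problems.append("feature_dim must be a positive integer")

    predictor_type = hp.get("predictor_type")
    if predictor_type not in VALID_PREDICTOR_TYPES:
        problems.append("predictor_type must be binary_classifier, "
                        "multiclass_classifier, or regressor")

    # num_classes is required only for multiclass classification.
    if predictor_type == "multiclass_classifier":
        num_classes = hp.get("num_classes")
        if not _is_int(num_classes) or not 3 <= num_classes <= 1_000_000:
            problems.append("num_classes must be an integer from 3 to 1,000,000")

    return problems


# Example values chosen for illustration only.
example = {
    "feature_dim": 784,
    "predictor_type": "multiclass_classifier",
    "num_classes": 10,
    "epochs": 15,             # optional; 15 is the default listed in the table
    "mini_batch_size": 1000,  # optional; 1000 is the default listed in the table
}
print(validate_hyperparameters(example))
```

For a binary_classifier or regressor configuration, the check skips num_classes, mirroring the table's note that the algorithm ignores it in those cases.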
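
The interaction between early_stopping_patience and early_stopping_tolerance can be sketched as follows. This is a minimal illustration of the rule described in the table, not the algorithm's actual implementation. It assumes a positive loss where lower is better, and it assumes that an improvement below the tolerance leaves the recorded best loss unchanged. The function name epochs_run_before_stopping is hypothetical.

```python
def epochs_run_before_stopping(losses, patience=3, tolerance=0.001):
    """Return how many epochs run before early stopping ends training.

    `losses` holds one metric value per epoch (lower is better, assumed positive).
    """
    best = losses[0]
    epochs_without_improvement = 0
    for epoch, loss in enumerate(losses[1:], start=2):
        # Improvement relative to the previous best loss.
        relative_improvement = (best - loss) / best
        if relative_improvement < tolerance:
            # Treated as zero improvement (assumption: best is not updated).
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
        else:
            best = loss
            epochs_without_improvement = 0
    # Patience was never exhausted, so every epoch ran.
    return len(losses)


# Three consecutive epochs improve by less than 0.1% of the best loss (0.5).
print(epochs_run_before_stopping([1.0, 0.5, 0.4999, 0.4998, 0.4997, 0.3]))
```

Setting patience above the number of epochs means the counter can never reach it, which matches the table's advice for disabling early stopping.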
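
To make the roles of huber_delta, loss_insensitivity, and margin more concrete, the sketch below writes out common textbook forms of the Huber, epsilon-insensitive, and hinge losses for a single example. These forms are assumptions: the table does not state the exact scaling the algorithm uses, and the hinge example assumes labels encoded as -1 and +1. The function names are hypothetical.

```python
def huber_loss(error, delta=1.0):
    """Quadratic (L2-like) for small errors, linear (L1-like) for large ones."""
    abs_error = abs(error)
    if abs_error <= delta:
        return 0.5 * abs_error ** 2
    return delta * (abs_error - 0.5 * delta)


def eps_insensitive_absolute_loss(error, epsilon=0.01):
    """Errors smaller than epsilon count as zero; larger ones grow linearly."""
    return max(0.0, abs(error) - epsilon)


def eps_insensitive_squared_loss(error, epsilon=0.01):
    """Errors smaller than epsilon count as zero; larger ones grow quadratically."""
    return max(0.0, abs(error) - epsilon) ** 2


def hinge_loss(label, score, margin=1.0):
    """Zero once the signed score clears the margin (label assumed in {-1, +1})."""
    return max(0.0, margin - label * score)


print(huber_loss(0.5), huber_loss(3.0))
print(eps_insensitive_absolute_loss(0.005), hinge_loss(-1, 0.5))
```

In each case the hyperparameter marks the point where the loss changes behavior: delta switches Huber loss from quadratic to linear, epsilon sets the width of the zero-loss band, and margin sets how far a score must be on the correct side before the loss vanishes.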
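
The learning-rate hyperparameters can be read together as in the sketch below. The bias formulas (learning_rate * bias_lr_mult and wd * bias_wd_mult) come directly from the table. The schedule is an assumption: it treats lr_scheduler_factor as a multiplier applied once every lr_scheduler_step steps, with lr_scheduler_minimum_lr as a floor, which the table does not spell out. The function names are hypothetical.

```python
def scheduled_learning_rate(base_lr, step, scheduler_step, factor, minimum_lr):
    """Learning rate after `step` updates, under an assumed multiplicative decay."""
    decays = step // scheduler_step  # number of decreases applied so far
    # The rate never drops below the configured minimum.
    return max(minimum_lr, base_lr * factor ** decays)


def bias_settings(learning_rate, wd, bias_lr_mult, bias_wd_mult):
    """Effective learning rate and L2 weight for the bias term, per the table."""
    return learning_rate * bias_lr_mult, wd * bias_wd_mult


# Illustrative values only; none of these are documented defaults.
for step in (0, 100, 250, 10_000):
    print(step, scheduled_learning_rate(0.1, step, 100, 0.5, 0.001))

print(bias_settings(learning_rate=0.1, wd=0.01, bias_lr_mult=2.0, bias_wd_mult=0.0))
```

A bias_wd_mult of 0 yields an L2 weight of 0 for the bias, consistent with the table's statement that the bias term is not regularized by default.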