NTM Hyperparameters

Parameter Name	Description
`feature_dim`	The vocabulary size of the dataset. Required Valid values: Positive integer (min: 1, max: 1,000,000)
num_topics	The number of required topics. Required Valid values: Positive integer (min: 2, max: 1000)
batch_norm	Whether to use batch normalization during training. Optional Valid values: true or false Default value: false
clip_gradient	The maximum magnitude for each gradient component. Optional Valid values: Float (min: 1e-3) Default value: Infinity
encoder_layers	The number of layers in the encoder and the output size of each layer. When set to auto, the algorithm uses two layers of sizes 3 x `num_topics` and 2 x `num_topics` respectively. Optional Valid values: Comma-separated list of positive integers or auto Default value: auto
encoder_layers_activation	The activation function to use in the encoder layers. Optional Valid values: [See the AWS documentation website for more details] Default value: `sigmoid`
epochs	The maximum number of passes over the training data. Optional Valid values: Positive integer (min: 1) Default value: 50
learning_rate	The learning rate for the optimizer. Optional Valid values: Float (min: 1e-6, max: 1.0) Default value: 0.001
mini_batch_size	The number of examples in each mini batch. Optional Valid values: Positive integer (min: 1, max: 10000) Default value: 256
num_patience_epochs	The number of successive epochs over which early stopping criterion is evaluated. Early stopping is triggered when the change in the loss function drops below the specified `tolerance` within the last `num_patience_epochs` number of epochs. To disable early stopping, set `num_patience_epochs` to a value larger than `epochs`. Optional Valid values: Positive integer (min: 1) Default value: 3
optimizer	The optimizer to use for training. Optional Valid values: [See the AWS documentation website for more details] Default value: `adadelta`
rescale_gradient	The rescale factor for gradient. Optional Valid values: float (min: 1e-3, max: 1.0) Default value: 1.0
sub_sample	The fraction of the training data to sample for training per epoch. Optional Valid values: Float (min: 0.0, max: 1.0) Default value: 1.0
tolerance	The maximum relative change in the loss function. Early stopping is triggered when change in the loss function drops below this value within the last `num_patience_epochs` number of epochs. Optional Valid values: Float (min: 1e-6, max: 0.1) Default value: 0.001
weight_decay	The weight decay coefficient. Adds L2 regularization. Optional Valid values: Float (min: 0.0, max: 1.0) Default value: 0.0