## Multi-GPU and distributed training using Horovod in Amazon SageMaker Pipe mode

This tutorial shows how to run multi-GPU training on a single instance on [Amazon SageMaker](https://aws.amazon.com/sagemaker/), and then moves on to efficient multi-GPU, multi-node distributed training on Amazon SageMaker.

It covers several training configurations:

- Training TensorFlow/Keras on a local machine
- Running a training job on separate training instance(s) with File mode input
- Running a training job on separate training instance(s) with Pipe mode input
- Running a distributed training job with Horovod with File mode input
- Running a distributed training job with Horovod with Pipe mode input

This example extends the following Amazon SageMaker example: https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/keras_script_mode_pipe_mode_horovod/tensorflow_keras_CIFAR10.ipynb

## Security

See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.

## License

This project is licensed under the Apache-2.0 License.