# Training scripts for common backbones used in Object Detection

The scripts train backbones on the ImageNet dataset from scratch; fine-tuning is not supported at the moment. Trained backbones can be used in the object detection library under the `detection` directory.

## Instructions to train

### Data Setup

Download the ImageNet data and prepare TFRecords using the preprocessing script [here](https://github.com/aws-samples/deep-learning-models/blob/master/legacy/utils/tensorflow/preprocess_imagenet.py).

### Docker Image

Prepare a Docker image for training. A sample Dockerfile is available [here](https://github.com/aws-samples/deep-learning-models/blob/master/models/vision/detection/docker/Dockerfile.ec2).

### EC2 training

Run the following inside the Docker container:

```
# Train an HRNet_W32C classifier
$ mpirun -np 8 -H localhost:8 -map-by slot -x NCCL_DEBUG=INFO -x TF_XLA_FLAGS=--tf_xla_cpu_global_jit -mca btl ^vader -mca btl_tcp_if_exclude tun0,docker0,lo --bind-to none --allow-run-as-root python train_backbone.py --train_data_dir /data/imagenet/tf_records/train/ --validation_data_dir /data/imagenet/tf_records/validation -b 128 --model hrnet_w32c --schedule cosine
```

### SageMaker training

WIP

## Details of training

In most cases we use a cosine decay learning-rate schedule and train for 120 epochs on ImageNet. Standard ImageNet data augmentation techniques are used; in addition, we apply [mixup](https://arxiv.org/abs/1710.09412) and label smoothing to improve results. A sketch of how these pieces fit together is given at the end of this README.

## Top-1 ImageNet accuracy

| Num_GPUs x Images_Per_GPU | Instance type | Model | Top-1 Acc (%) | Training Notes |
| ------------------------- | ------------- | ------------: | ------------: | ----- |
| (1x8)x128 | P3.16xl | ResNet50V1_b | 76.8 | 7.3 iters/sec |
| (1x8)x128 | P3.16xl | ResNet50V1_d | 77.6 | 6.2 iters/sec |
| (1x8)x128 | P3.16xl | ResNet101V1_b | 78.9 | 4.8 iters/sec |
| (1x8)x128 | P3.16xl | ResNet101V1_d | 79.5 | 4.3 iters/sec |
| (1x8)x128 | P3.16xl | HRNetW32_c | 79.1 | 2.3 iters/sec |
| (1x8)x128 | P3.16xl | DarkNet53 | 77.0 | 6.4 iters/sec, crop dim 256 |
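
## Appendix: sketch of the training recipe

Below is a minimal, hedged sketch of how a cosine decay schedule, label smoothing, and mixup can be combined in a TF2/Keras setup. It is illustrative only and is not the code in `train_backbone.py`; the hyperparameter values (initial learning rate, `alpha=0.2`, `label_smoothing=0.1`) and the exact pipeline wiring are assumptions.

```
# Sketch only (assumed values, not the repository's actual implementation):
# cosine LR decay, label smoothing, and mixup for ImageNet classification.
import tensorflow as tf

NUM_CLASSES = 1000
BATCH_SIZE = 128
EPOCHS = 120
STEPS_PER_EPOCH = 1281167 // BATCH_SIZE  # ImageNet-1k training images


def mixup(images, labels, alpha=0.2):
    """Mix a batch with a shuffled copy of itself (https://arxiv.org/abs/1710.09412)."""
    # Sample lambda ~ Beta(alpha, alpha) from two Gamma draws.
    g1 = tf.random.gamma([1], alpha)[0]
    g2 = tf.random.gamma([1], alpha)[0]
    lam = g1 / (g1 + g2)
    # Pair each example with a randomly chosen partner from the same batch.
    idx = tf.random.shuffle(tf.range(tf.shape(images)[0]))
    mixed_images = lam * images + (1.0 - lam) * tf.gather(images, idx)  # images assumed float
    mixed_labels = lam * labels + (1.0 - lam) * tf.gather(labels, idx)  # labels assumed one-hot
    return mixed_images, mixed_labels


# Cosine learning-rate decay over the full 120-epoch run.
lr_schedule = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=0.1, decay_steps=EPOCHS * STEPS_PER_EPOCH)
optimizer = tf.keras.optimizers.SGD(learning_rate=lr_schedule, momentum=0.9)

# Label smoothing is handled by the loss; labels must be one-hot for both
# label smoothing and mixup, e.g. tf.one_hot(label, NUM_CLASSES) in the pipeline.
loss_fn = tf.keras.losses.CategoricalCrossentropy(label_smoothing=0.1)

# Apply mixup per batch in the tf.data pipeline, after batching and one-hot encoding:
# dataset = dataset.batch(BATCH_SIZE).map(mixup, num_parallel_calls=tf.data.AUTOTUNE)
```

Sampling lambda from two Gamma draws keeps the sketch free of extra dependencies such as TensorFlow Probability, and because mixup produces soft one-hot targets, the same label-smoothing cross-entropy loss can be used unchanged.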