1. Create ns `test`: `kubectl create ns test`
2. Run single node test: `kubectl apply -f debug_job.yaml` and check pod logs.
3. Install MPI-operator: `kubectl apply -f mpi.yaml`
4. Run multi node NCCL-test: change `nodeSelector` and `tolerations` according to your cluster and run `kubectl apply -d nccl_test.yaml`, then check launcher logs.


# Note on results of NCCL tests (4th step):
You chould see logs similar to these:
```
....

new-st-gpu-2:8802:8863 [5] NCCL INFO NET/OFI Using aws-ofi-nccl 1.5.0aws
new-st-gpu-2:8802:8863 [5] NCCL INFO NET/OFI Configuring AWS-specific options

...

new-st-gpu-2:8804:8862 [6] NCCL INFO NET/OFI Selected Provider is efa (found 4 nics)
new-st-gpu-2:8804:8862 [6] NCCL INFO Using network AWS Libfabric
....

# on p4d with RDMA
...
via NET/AWS Libfabric/1/GDRDMA
...
```