---
title: "c. Examine an EFA enabled instance"
date: 2020-05-12T10:00:58Z
weight: 20
tags: ["tutorial", "EFA", "ec2", "fi_info", "mpi"]
---

In this section, you will learn how to check if EFA is enabled on your cluster.

Make sure you're [connected to the cluster](/05-create-cluster/02-connect-cluster.html#ssm-connect) before proceeding.

#### EFA Enabled

To check if an instance supports EFA, run the **fi_info -p efa** command. It queries libfabric to see whether the EFA fabric interface is active. If we run this command on the master:

```bash
$ fi_info -p efa
fi_getinfo: -61
```

The `-61` return code indicates "Not Found": the EFA interface is not enabled on the master. To accelerate our jobs, we need to run them on the compute instances. In the following sections we'll spin up a compute instance and examine it again with **fi_info**.

#### Allocate a Compute Node

First, you need to allocate a compute node. We'll use `salloc` to allocate an instance:

```bash
salloc -N 1
```

#### Check the Node Status

Starting up a new node takes about 2 minutes. In the meantime, you can check the status of the queue using the command **squeue**. The job will first be marked as configuring (*CF* state) while resources are being created. If you check the **Instances** tab in the ParallelCluster UI, you should see nodes booting up. When ready, the nodes are added automatically to your SLURM cluster and you will see the running (**R**) status as below.

```bash
watch squeue
```

```bash
JOBID PARTITION     NAME     USER ST  TIME NODES NODELIST(REASON)
    3   compute interact ec2-user  R  1:01     1 compute-dy-hpc6a-1
```

Hit **Ctrl-C** to exit `watch squeue`.

#### Check the Compute Node Status

You can also check the number of nodes available in your cluster using the command **sinfo**. Do not hesitate to rerun it; nodes generally take less than 2 minutes to appear. The following example shows one node allocated (in the *mix* state).

```bash
sinfo
```

```bash
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
compute*     up   infinite     63  idle~ compute-dy-hpc6a-[2-64]
compute*     up   infinite      1    mix compute-dy-hpc6a-1
```

#### SSH Into the Compute Node from the Master

At this stage your compute node is ready and you can connect to it using **ssh**:

```bash
ssh compute-dy-hpc6a-1
```

#### Check EFA

Once you are in, you can use the **fi_info** tool to verify whether EFA is active. The tool also reports provider support and the available interfaces, and helps validate the libfabric installation:

```bash
fi_info -p efa
```

The output of **fi_info** should be similar to the one below:

```bash
provider: efa
    fabric: EFA-fe80::4b4:caff:fe96:3ba0
    domain: rdmap0s6-rdm
    version: 113.20
    type: FI_EP_RDM
    protocol: FI_PROTO_EFA
provider: efa
    fabric: EFA-fe80::4b4:caff:fe96:3ba0
    domain: rdmap0s6-dgrm
    version: 113.20
    type: FI_EP_DGRAM
    protocol: FI_PROTO_EFA
```

Now you can disconnect from the compute node by typing **exit**:

```bash
exit
```

Make sure to cancel the job with `scancel [job_id]` so your compute node gets terminated:

```bash
$ squeue
JOBID PARTITION     NAME     USER ST  TIME NODES NODELIST(REASON)
    3   compute interact ec2-user  R  4:14     1 compute-dy-hpc6a-1
$ scancel 3
salloc: Job allocation 3 has been revoked.
Hangup
```

Next, compile and install a simple HPC benchmark.
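
#### (Optional) Check EFA Without SSH

If you prefer not to SSH into a compute node, you can run the same check from the master with a one-off `srun` command. This is a minimal sketch; it assumes the `compute` partition shown in the `sinfo` output above, and it will allocate (and release) a node just like `salloc`:

```bash
# Run fi_info on a single compute node and print the result back on the master.
# -N 1: use one node; -p compute: the partition used in this tutorial.
srun -N 1 -p compute fi_info -p efa
```

As with `salloc`, the node may take a couple of minutes to start before the command returns.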