= Test Performance
:toc:
:icons:
:linkattrs:
:imagesdir: ../../resources/images

== Summary

This section will test the performance of Amazon FSx for Lustre.

== Duration

NOTE: It will take approximately 15 minutes to complete this section.

== Step-by-step Guide

IMPORTANT: Read through all steps below before continuing.

=== Smallfile Read tests

. Open the link:https://console.aws.amazon.com/ec2/home[Amazon EC2] console.
+
NOTE: Make sure you are in the *AWS Region* of your workshop environment. If you need to change the *AWS Region* of the Amazon EC2 console, *_click_* the region name next to *Support* in the top right corner of the browser window, then *_click_* the appropriate *AWS Region* in the drop-down menu.
+
TIP: If the Session Manager window has timed out, e.g. the session is unresponsive, close the session window and create a new one. Return to the link:https://console.aws.amazon.com/ec2/home[Amazon EC2] console, *_click_* the check box next to the instance named *Linux Instance 0*, and *_click_* the *Connect* button. Connect using AWS Systems Manager: *_select_* the *Session Manager* tab and *_click_* the *Connect* button to open a session in your browser. Change the shell back to bash by typing the command *_bash_* and pressing return.
. Open two (2) Session Manager windows connected to *Linux Instance 0*.
. Start `*nload*` in one of the Session Manager terminal windows.
+
[source,bash]
----
bash
nload -u M
----
+
*_Copy_* the commands below and *_execute_* them in the other Session Manager terminal window.
. Install smallfile. link:https://github.com/distributed-system-analysis/smallfile[smallfile] is a distributed, metadata-intensive workload generator for POSIX-like filesystems. It is licensed under the Apache License, Version 2.0.
+
[source,bash]
----
bash
cd
git clone https://github.com/bengland2/smallfile.git
----
. Write (create) ~316,200 files.
+
[source,bash]
----
sudo chown ssm-user:ssm-user /fsx
job_name=$(echo $(uuidgen)| grep -o ".\{6\}$")
prefix=$(echo $(uuidgen)| grep -o ".\{6\}$")
path=/fsx/${job_name}
mkdir -p ${path}
threads=32
file_size=64
file_count=10000
operation=create
same_dir=N
python ~/smallfile/smallfile_cli.py \
--operation ${operation} \
--threads ${threads} \
--file-size ${file_size} \
--files ${file_count} \
--same-dir ${same_dir} \
--hash-into-dirs Y \
--prefix ${prefix} \
--dirs-per-dir ${file_count} \
--files-per-dir ${file_count} \
--top ${path}
----
. Read the ~316,200 files.
+
[source,bash]
----
operation=read
python ~/smallfile/smallfile_cli.py \
--operation ${operation} \
--threads ${threads} \
--file-size ${file_size} \
--files ${file_count} \
--same-dir ${same_dir} \
--hash-into-dirs Y \
--prefix ${prefix} \
--dirs-per-dir ${file_count} \
--files-per-dir ${file_count} \
--top ${path}
----
. Stat the ~316,200 files.
+
[source,bash]
----
operation=stat
python ~/smallfile/smallfile_cli.py \
--operation ${operation} \
--threads ${threads} \
--file-size ${file_size} \
--files ${file_count} \
--same-dir ${same_dir} \
--hash-into-dirs Y \
--prefix ${prefix} \
--dirs-per-dir ${file_count} \
--files-per-dir ${file_count} \
--top ${path}
----

=== dd tests

. Use dd to generate data, initially with a single thread.
+
[source,bash]
----
job_name=$(echo $(uuidgen)| grep -o ".\{6\}$")
bs=1024K
count=4096
sync=oflag=sync
threads=1
mkdir -p /fsx/${job_name}/{1..128}
time seq 1 ${threads} | parallel --will-cite -j ${threads} dd if=/dev/zero of=/fsx/${job_name}/{}/dd-$(date +%Y%m%d%H%M%S.%3N) bs=${bs} count=${count} ${sync}
----
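+
Each dd process writes `bs` x `count` = 1,024 KiB x 4,096 = 4 GiB of zeros, so the total volume written scales with the thread count (the two- and three-thread runs below write 8 GiB and 12 GiB in total). Note the `real` time reported by `time` so you can compare it with those runs. The optional sketch below simply reproduces this arithmetic; it is not part of the benchmark itself.
+
[source,bash]
----
# Optional sanity check: expected data volume for 1, 2, and 3 threads (not part of the benchmark)
bs_bytes=$((1024 * 1024))   # bs=1024K, i.e. 1 MiB blocks
count=4096                  # blocks written by each dd process
for threads in 1 2 3; do
  echo "threads=${threads}: $(( bs_bytes * count * threads / 1024**3 )) GiB total"
done
----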
. Use dd to generate data with two threads, doubling the amount of data generated in the single-threaded test. Does it take twice as long to generate twice the amount of data?
+
[source,bash]
----
job_name=$(echo $(uuidgen)| grep -o ".\{6\}$")
bs=1024K
count=4096
sync=oflag=sync
threads=2
mkdir -p /fsx/${job_name}/{1..128}
time seq 1 ${threads} | parallel --will-cite -j ${threads} dd if=/dev/zero of=/fsx/${job_name}/{}/dd-$(date +%Y%m%d%H%M%S.%3N) bs=${bs} count=${count} ${sync}
----
. Use dd to generate data with three threads, tripling the amount of data generated in the single-threaded test. Does it take three times as long to generate three times the amount of data?
+
[source,bash]
----
job_name=$(echo $(uuidgen)| grep -o ".\{6\}$")
bs=1024K
count=4096
sync=oflag=sync
threads=3
mkdir -p /fsx/${job_name}/{1..128}
time seq 1 ${threads} | parallel --will-cite -j ${threads} dd if=/dev/zero of=/fsx/${job_name}/{}/dd-$(date +%Y%m%d%H%M%S.%3N) bs=${bs} count=${count} ${sync}
----

=== ior tests

. Use ior to generate data.
+
[source,bash]
----
mkdir -p /fsx/ior
time seq 1 2 | parallel --will-cite -j2 'ior -b 32g -t 8m -w -r -k -F --posix.odirect -o /fsx/ior/ior.bin.{}'
----

== Next section

Click the button below to go to the next section.

image::enable-data-compression.jpg[link=../05-enable-data-compression/, align="left",width=420]
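For reference on the ior run above: each of the two ior processes writes a 32 GiB file and then reads it back (`-b 32g -w -r`), so roughly 64 GiB moves in each direction. ior prints its own per-phase results, but the minimal sketch below shows how to turn the wall-clock time reported by `time` into a rough combined throughput figure; the elapsed value shown is a hypothetical placeholder, not a measured result.

[source,bash]
----
# Rough combined (write + read) throughput estimate for the ior run above.
# elapsed_seconds is a hypothetical placeholder -- substitute the 'real' time reported by `time`.
elapsed_seconds=120
gib_per_direction=$(( 2 * 32 ))          # two ior processes x 32 GiB each (-b 32g)
total_gib=$(( gib_per_direction * 2 ))   # written once, then read back once
echo "~$(( total_gib * 1024 / elapsed_seconds )) MiB/s combined"
----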