= Test data compression
:toc:
:icons:
:linkattrs:
:imagesdir: ./../resources/images

== Summary

This section will test data compression of *Amazon FSx for Lustre*.

== Duration

NOTE: It will take approximately 15 minutes to complete this section.

== Step-by-step Guide

=== Compress existing data

Files are uncompressed if they were created when data compression was turned off on the Amazon FSx for Lustre file system. Turning on data compression will not automatically compress your existing uncompressed data. You can use the `lfs_migrate` command, which is installed as part of the Lustre client installation, to compress existing files.

IMPORTANT: Read through all steps below.

TIP: If you have the two (2) session windows open from the previous section, just use the existing session windows. Do not open two (2) more session windows.

. Open two (2) *SSH terminal* or *EC2 Instance Connect* session windows connected to *Linux Instance*.

. Start `*nload*` in one of the SSH terminal session windows.
+
[source,bash]
----
nload -u M
----

. Complete these steps from the other *SSH terminal* or *EC2 Instance Connect* session window connected to *Linux Instance*.
+
*_Copy_* and *_execute_* the scripts below on the other *SSH terminal* or *EC2 Instance Connect* session.

. *_Verify_* the file exists.
+
[source,bash]
----
lfs find /fsx/ior/ior.bin.00000000
----

. *_Find_* the Lustre stripe configuration of the file.
+
[source,bash]
----
lfs getstripe /fsx/ior/ior.bin.00000000
----

. What is the logical size of the file? *_Run_* `du --apparent-size -sh` against the file to find the logical size of the file.
+
[source,bash]
----
time du --apparent-size -sh /fsx/ior/ior.bin.00000000
----
+
My results are below. The file has a logical size of 32 GiB.
+
----
32G /fsx/ior/ior.bin.00000000
----

. What is the physical size of the file? *_Run_* `du -sh` against the file to find the physical size of the file.
+
[source,bash]
----
time du -sh /fsx/ior/ior.bin.00000000
----
+
My results are below. The file has a physical size of 32 GiB.
+
----
32G /fsx/ior/ior.bin.00000000
----

. Migrate the ior.bin.00000000 file. *_Run_* `lfs migrate` against the file to migrate the file. `lfs migrate` will rewrite the file in place.
+
TIP: While the *lfs migrate* command is running, switch to the *SSH terminal* or *EC2 Instance Connect* session window running *nload* and actively monitor the performance of the *lfs migrate* operation. What is happening on the instance and file system?
+
[source,bash]
----
lfs migrate /fsx/ior/ior.bin.00000000
----

. What is the uncompressed or logical size of the file? *_Run_* `du --apparent-size -sh` against the file to find the uncompressed or logical size of the file.
+
[source,bash]
----
time du --apparent-size -sh /fsx/ior/ior.bin.00000000
----
+
My results are below. The file has a logical size of 32 GiB.
+
----
32G /fsx/ior/ior.bin.00000000
----

. What is the compressed or physical size of the file? *_Run_* `du -sh` against the file to find the compressed or physical size of the file.
+
[source,bash]
----
time du -sh /fsx/ior/ior.bin.00000000
----
+
My results are below. The file has a physical size of 9.1 GiB.
+
----
9.1G /fsx/ior/ior.bin.00000000
----

. *_Calculate_* the data compression ratio of the file.
+
The data compression ratio is the ratio between the uncompressed size and the compressed size. Take the uncompressed or logical size of the file and divide it by the compressed or physical size of the file.
+
My calculation is below. The file has a data compression ratio of 3.52 to 1.
+
----
32G ÷ 9.1G = 3.52:1
----

. *_Calculate_* the space savings for the file.
+
Space savings is the reduction in size relative to the uncompressed size. Take the compressed or physical size of the file and divide it by the uncompressed or logical size of the file and subtract that value from 1. This is typically shared as a percentage.
+
My calculation is below. The file has a space savings of 71.56%.
+
----
1 - (9.1G/32G) = .7156 or 71.56%
----

=== Compress new data

IMPORTANT: Read through all steps below.

TIP: If you have the two (2) session windows open from the previous section, just use the existing session windows. Do not open two (2) more session windows.

. Open two (2) *SSH terminal* or *EC2 Instance Connect* session windows connected to *Linux Instance*.

. Start `*nload*` in one of the SSH terminal session windows.
+
[source,bash]
----
nload -u M
----

. Complete these steps from the other *SSH terminal* or *EC2 Instance Connect* session window connected to *Linux Instance*.
+
TIP: Monitor real-time throughput using the *EC2 Instance Connect* or *SSH terminal* session window with `*nload*` running.

. Use ior to generate 32 GiB of data using 1 thread. *_Run_* the mpirun ior scripts below.
+
TIP: Monitor real-time throughput using the *EC2 Instance Connect* or *SSH terminal* session window with `*nload*` running.
+
[source,bash]
----
_job_name=ior-compressed
_segment_count=32768
_threads=1
_path=/fsx/${_job_name}
mkdir -p ${_path}
cd /fsx
mpirun --npernode ${_threads} --oversubscribe ior --posix.odirect -t 1m -b 1m -s ${_segment_count} -g -v -w -i 1 -F -k -D 0 -o ${_path}/ior.bin
----

. How long did it take to generate ~32 GiB of data using 1 thread?
The ior output should look similar to this; the `total(s)` column shows the elapsed time:
+
[source,bash]
----
threads=1
Results:

access bw(MiB/s) IOPS Latency(s) block(KiB) xfer(KiB) open(s) wr/rd(s) close(s) total(s) iter
------ --------- ---- ---------- ---------- --------- -------- -------- -------- -------- ----
Commencing write performance test:
write 659.11 659.11 49.72 1024.00 1024.00 0.000367 49.72 0.000351 49.72 0
Max Write: 659.11 MiB/sec (691.12 MB/sec)

Summary of all tests:
Operation Max(MiB) Min(MiB) Mean(MiB) StdDev Max(OPs) Min(OPs) Mean(OPs) StdDev Mean(s) Stonewall(s) Stonewall(MiB) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggs(MiB) API RefNum
write 659.11 659.11 659.11 0.00 659.11 659.11 659.11 0.00 49.71589 NA NA 0 1 1 1 1 0 1 0 0 32768 1048576 1048576 32768.0 POSIX 0
Finished :
----

. How much write throughput was achieved using 1 thread?

. Compare these results with the *Test performance* section you completed earlier against the file system when data compression was not enabled.

. What is the uncompressed or logical size of the file? *_Run_* `du --apparent-size -sh` against the file to find the uncompressed or logical size of the file.
+
[source,bash]
----
time du --apparent-size -sh /fsx/ior-compressed/ior.bin.00000000
----
+
My results are below. The file has a logical size of 32 GiB.
+
----
32G /fsx/ior-compressed/ior.bin.00000000
----

. What is the compressed or physical size of the file? *_Run_* `du -sh` against the file to find the compressed or physical size of the file.
+
[source,bash]
----
time du -sh /fsx/ior-compressed/ior.bin.00000000
----
+
My results are below. The file has a physical size of 9.0 GiB.
+
----
9.0G /fsx/ior-compressed/ior.bin.00000000
----

. *_Calculate_* the data compression ratio of the file.
+
The data compression ratio is the ratio between the uncompressed size and the compressed size.
Take the uncompressed or logical size of the file and divide it by the compressed or physical size of the file.
+
My calculation is below. The file has a data compression ratio of 3.56 to 1.
+
----
32G ÷ 9.0G = 3.56:1
----

. *_Calculate_* the space savings for the file.
+
Space savings is the reduction in size relative to the uncompressed size. Take the compressed or physical size of the file and divide it by the uncompressed or logical size of the file and subtract that value from 1. This is typically shared as a percentage.
+
My calculation is below. The file has a space savings of 71.88%.
+
----
1 - (9.0G/32G) = .7188 or 71.88%
----

== Next section

Click the button below to go to the next section.

image::monitor-performance.jpg[link=../07-monitor-performance/, align="left",width=420]
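
Optionally, the two formulas from the *_Calculate_* steps in this section can be scripted to check the arithmetic. The sketch below is a minimal example and is not part of the workshop scripts: the function names `compression_ratio` and `space_savings` are hypothetical, and it assumes `awk` is available on the instance for floating-point math. Both functions take the logical size first and the physical size second, in the same unit.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the workshop scripts).
# Both take: LOGICAL_SIZE PHYSICAL_SIZE, expressed in the same unit.

# Data compression ratio = logical size / physical size.
compression_ratio() {
  awk -v logical="$1" -v physical="$2" \
    'BEGIN { printf "%.2f\n", logical / physical }'
}

# Space savings = 1 - (physical size / logical size), as a percentage.
space_savings() {
  awk -v logical="$1" -v physical="$2" \
    'BEGIN { printf "%.2f\n", (1 - physical / logical) * 100 }'
}

# Example with a real file (assumes GNU du, which reports bytes with
# --block-size=1):
#   f=/fsx/ior/ior.bin.00000000
#   logical=$(du --apparent-size --block-size=1 "$f" | cut -f1)
#   physical=$(du --block-size=1 "$f" | cut -f1)
#   compression_ratio "$logical" "$physical"
#   space_savings "$logical" "$physical"
```

For example, `compression_ratio 32 9.1` prints `3.52` and `space_savings 32 9.1` prints `71.56`, matching the results from the *Compress existing data* steps.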