= Enable data deduplication :toc: :icons: :linkattrs: :imagesdir: ../resources/images == Summary This section will enable data deduplication. == Duration NOTE: It will take approximately 20 minutes to complete this section. == Step-by-step Guide === Enable data deduplication IMPORTANT: Read through all steps below and watch the quick video before continuing. image::enable-data-dedup.gif[align="left", width=600] . *_Copy_* the script below into your favorite text editor. + [source,bash] ---- $WindowsRemotePowerShellEndpoint = "windows_remote_powershell_endpoint" # e.g. "fs-0123456789abcdef.example.com" enter-pssession -ComputerName ${WindowsRemotePowerShellEndpoint} -ConfigurationName FsxRemoteAdmin ---- + . From the link:https://console.aws.amazon.com/fsx/[Amazon FSx] console, *_click_* the link to the *STG326 - SAZ* file system and *_select_* the *Network & security* tab. *_Copy_* the *Windows Remote PowerShell Endpoint* of the file system to the clipboard (e.g. fs-0123456789abcdef.example.com). . Return to your favorite text editor and replace *"windows_remote_powershell_endpoint"* with the *Windows Remote PowerShell Endpoint* of *STG326 - SAZ*. *_Copy_* the updated script. . Go to the remote desktop session for your *Windows Instance 0*. . *_Click_* *Start* >> *Windows PowerShell*. . *_Run_* the updated script in the *Windows PowerShell* window. + NOTE: Complete the next few steps using the remote PowerShell session to the FSx file server. + . Review the PowerShell function commands for data deduplication available using the *Amazon FSx CLI for remote management on PowerShell*. * *_Run_* the command in the *Remote Windows PowerShell Session*. + [source,bash] ---- Get-Command *-FSxDedup* ---- + . What commands are available? . Enable data depduplication for the entire FSx file system. * *_Run_* the command in the *Remote Windows PowerShell Session*. + [source,bash] ---- Enable-FSxDedup ---- + . Examine your data deduplication environment using the commands in the table below. + |=== | *Command* | Get-FSxDedupConfiguration | Get-FSxDedupStatus | Get-FSxDedupJob | Get-FSxDedupMetadata | Get-FSxDedupSchedule | Measure-FSxDedupFileMetadata -path "D:\share" |=== + * Were all these commands successful? Why not? * When is the next scheduled "Optimization" task? . End the remote PowerShell session. *_Run_* *Exit-PSSession*. . Close the PowerShell window. *_Run_* *exit*. === Create new data deduplication optimization schedule IMPORTANT: Read through all steps below and watch the quick video before continuing. image::new-data-dedup-schedule.gif[align="left", width=600] . *_Copy_* the script below into your favorite text editor. + [source,bash] ---- $WindowsRemotePowerShellEndpoint = "windows_remote_powershell_endpoint" # e.g. "fs-0123456789abcdef.example.com" enter-pssession -ComputerName ${WindowsRemotePowerShellEndpoint} -ConfigurationName FsxRemoteAdmin ---- + . From the link:https://console.aws.amazon.com/fsx/[Amazon FSx] console, *_click_* the link to the *STG326 - SAZ* file system and *_select_* the *Network & security* tab. *_Copy_* the *Windows Remote PowerShell Endpoint* of the file system to the clipboard (e.g. fs-0123456789abcdef.example.com). . Return to your favorite text editor and replace *"windows_remote_powershell_endpoint"* with the *Windows Remote PowerShell Endpoint* of *STG326 - SAZ*. *_Copy_* the updated script. . Go to the remote desktop session for your *Windows Instance 0*. . *_Click_* *Start* >> *Windows PowerShell*. . *_Run_* the updated script in the *Windows PowerShell* window. IMPORTANT: Complete the next few steps using the remote PowerShell session to the FSx file server. . Create a new data deduplication optimization schedule. * *_Run_* the command in the *Remote Windows PowerShell Session*. + [source,bash] ---- New-FSxDedupSchedule ---- + * Use the table values when prompted. + |=== | *Prompt* | *Value* | Name | DailyOptimization | Type | Optimization |=== + . What time will the optimization start? . Examine the different options available to data deduplication jobs. * *_Run_* the command in the *Remote Windows PowerShell Session*. + [source,bash] ---- Set-FSxDedupSchedule -? ---- + . *_Copy_* the command below into your favorite text editor and update the *start_time* parameter with the current time plus 2 minutes. Look at the clock in bottom right corner of the remote desktop window. Add 2 minutes to this time and replace the *start_time* parameter with this value. (i.e. 5:32pm). This time is in UTC. + [source,bash] ---- Set-FSxDedupSchedule -Name DailyOptimization -Start start_time ---- + * Run the updated command in the *Windows PowerShell* window. * Wait for the time of the DailyOptimization scheduled job to pass (i.e. 1 minute after the start_time you entered above) and run the command below to check the status. * *_Run_* the command in the *Remote Windows PowerShell Session*. + [source,bash] ---- Get-FSxDedupStatus ---- + . Did the optimization schedule run? * Look at the LastOptimizationTime value of the Get-FSxDedupStatus output. . How many files were optimized and how much space is saved? * Find the corresponding Get-FSxDedupStatus output for the command attributes in the table below + |=== | *Attribute* | LastOptimizationResult | OptimizedFilesCount | OptimizedFilesSavingsRate | OptimizedFilesSize | SavedSpace |=== + . Do you see any optimization? Why not? . Quickly read the *_Enabling data deduplication_* section of the link:https://docs.aws.amazon.com/fsx/latest/WindowsGuide/using-data-dedup.html[Amazon FSx for Windows File Server User Guide] to find the answer. * *_Run_* the command in the *Remote Windows PowerShell Session*. + [source,bash] ---- Get-FSxDedupConfiguration ---- + . What is the MinimumFileAgeDays attribute value? . Update the data deduplication configuration and set the minimum file age days attribute to 0. * *_Run_* the command in the *Remote Windows PowerShell Session*. + [source,bash] ---- Set-FSxDedupConfiguration -MinimumFileAgeDays 0 ---- + . Update the DailyOptimization data deduplication schedule to run in 2 minutes. . *_Copy_* the command below into your favorite text editor and update the *start_time* parameter with the current time plus 2 minutes. Look at the clock in bottom right corner of the remote desktop window. Add 2 minutes to this time and replace the *start_time* parameter with this value. (i.e. 5:32pm) + [source,bash] ---- Set-FSxDedupSchedule -Name DailyOptimization -Start start_time ---- + * *_Run_* the updated command in the *Remote Windows PowerShell Session*. * Wait for the time of the DailyOptimization scheduled job to pass (i.e. 1 minute after the start_time you entered above) and run the command below to check on the status. * *_Run_* the command in the *Remote Windows PowerShell Session*. + [source,bash] ---- Get-FSxDedupStatus ---- + . Did the optimization schedule run? * Look at the LastOptimizationTime value of the Get-FSxDedupStatus output. . The active data deduplication job may still be running. *_run_* the following command in the *Remote Windows PowerShell Session* to check on the status of the data deduplication job. + [source,bash] ---- Get-FSxDedupJob ---- + . Continue to re-run the Get-FSxDedupJob command every few minutes to check on the status of the job. This may take 5-10 minutes depending on the amount of data you creating during the *test performance* section. . Continue with the tutorial while the data deduplication job runs in the background. . If the Get-FSxDedupJob command returns an error, then there are no more active jobs and the job has completed. . *_Run_* the command in the *Remote Windows PowerShell Session*. + [source,bash] ---- Get-FSxDedupStatus ---- + . How many files were optimized and how much space is saved? * Find the corresponding Get-FSxDedupStatus output for the command attributes in the table below. + |=== | *Attribute* | LastOptimizationResult | OptimizedFilesCount | OptimizedFilesSavingsRate | OptimizedFilesSize | SavedSpace |=== . End the remote PowerShell session. *_Run_* *Exit-PSSession*. . Close the PowerShell window. *_Run_* *exit*. == Next section Click the button below to go to the next section. image::07-enable-shadow-copies.png[link=../07-enable-shadow-copies/, align="left",width=420]