# Overview

CloudWatch can be used to improve the observability of your DRS deployments:

### Dashboards
  * You can create a CloudWatch dashboard in each AWS account and region that uses DRS and contains source servers, to monitor the DRS source servers in that account.
  * You can create a cross-account CloudWatch dashboard that consolidates the metrics from every configured DRS account and region in a multi-account deployment into a single view.

### CloudWatch Logs
  * You can deploy the CloudWatch agent to stream the AWS DRS Agent logs to CloudWatch Logs for search, alarming, and troubleshooting.
  * You can deploy the CloudWatch agent to collect process-level metrics for the AWS DRS agent, including write throughput, so you can estimate the bandwidth required for replication.
  * You can create CloudWatch Metric Filters to track warnings and errors generated by the DRS replication agent and written to its log file.

## CloudWatch Dashboards

CloudWatch Metric Insights queries are used to chart the **LagDuration** and **Backlog** metrics per DRS Source Server.  

A DRS CloudWatch dashboard should be deployed in each AWS account and region where DRS is used:  [drs-cloudwatch-dashboard-per-account.yaml](./drs-cloudwatch-dashboard-per-account.yaml).  

A DRS CloudWatch dashboard should be deployed in a centralized logging / monitoring account that consolidates metrics for each account into one dashboard: [drs-cloudwatch-dashboard-cross-account.yaml](./drs-cloudwatch-dashboard-cross-account.yaml)
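
To preview the data that the dashboards chart, you can run a similar Metrics Insights query from the AWS CLI. The following is a minimal sketch, not the query used in the templates. It assumes DRS publishes its metrics to the `AWS/DRS` namespace with a `SourceServerID` dimension, and the `drs_*` function names are hypothetical helpers introduced for illustration.

```shell
# Build a Metrics Insights query that returns one time series per source server.
# Assumption: metrics live in the AWS/DRS namespace with a SourceServerID dimension.
drs_insights_query() {
  metric="$1"   # for example LagDuration or Backlog
  printf 'SELECT MAX(%s) FROM SCHEMA("AWS/DRS", SourceServerID) GROUP BY SourceServerID' "$metric"
}

# Wrap the query in the JSON structure that get-metric-data expects,
# escaping the double quotes inside the expression.
drs_metric_data_queries() {
  query=$(drs_insights_query "$1" | sed 's/"/\\"/g')
  printf '[{"Id":"q1","Expression":"%s","Period":300}]' "$query"
}

# Fetch data points for one metric in one region between two ISO 8601 timestamps.
drs_get_metric() {
  metric="$1"; region="$2"; start="$3"; end="$4"
  aws cloudwatch get-metric-data \
    --metric-data-queries "$(drs_metric_data_queries "$metric")" \
    --start-time "$start" --end-time "$end" \
    --region "$region"
}
```

For example, `drs_get_metric LagDuration us-east-1 2024-01-01T00:00:00Z 2024-01-01T01:00:00Z` would request the maximum lag per source server for that hour, provided the assumed namespace and dimension match your account.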

## AWS EventBridge DRS Notifications
[DRS provides EventBridge notifications for failed recoveries and stalled agents](https://docs.aws.amazon.com/drs/latest/userguide/monitoring-event-bridge-sample.html).  You can use EventBridge to notify you when these events occur.

Templates are provided to capture failed recoveries and stalled agents per DRS account and centrally consolidate them for integrated notification in a central account:

* [drs-eventbridge-per-account.yaml](./drs-eventbridge-per-account.yaml):  Deploy this template in each of your configured DRS accounts in your AWS Organization.   
* [drs-eventbridge-central-account.yaml](./drs-eventbridge-central-account.yaml):  Deploy this template in a centralized account you are using for DRS monitoring and logging.


#### AWS EventBridge DRS Notifications - Central Account
Update your local environment credentials to your centralized/logging DRS account.

Update the template parameters to reflect the appropriate values.

Run the following command to deploy the central account EventBridge rules and SNS topic:
```shell
aws cloudformation create-stack --stack-name drs-amazon-eventbridge-central-account \
--template-body file://drs-eventbridge-central-account.yaml \
--parameters ParameterKey=AwsOrganizationId,ParameterValue="<AWS Organization ID that AWS DRS Accounts are a member of>" \
ParameterKey=DrsSnsSubscriptionEmailAddress,ParameterValue="<Email Address to subscribe to AWS DRS EventBridge Notifications from SNS Topic>" \
ParameterKey=DrsEventBusName,ParameterValue="<Enter Name for Custom AWS EventBridge Event Bus to be created>" \
--region <enter your aws region id, e.g. "us-east-1">
```
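
`create-stack` returns as soon as the request is accepted, so it can help to wait for the stack to finish before deploying the per-account template that depends on the central event bus. The sketch below uses a hypothetical helper name, `drs_wait_for_stack`, and relies only on the standard `aws cloudformation wait` and `describe-stacks` commands.

```shell
# Block until the stack finishes creating, then print its final status.
drs_wait_for_stack() {
  stack_name="$1"; region="$2"
  # The waiter exits non-zero if the stack fails or rolls back.
  aws cloudformation wait stack-create-complete \
    --stack-name "$stack_name" --region "$region" || return 1
  aws cloudformation describe-stacks \
    --stack-name "$stack_name" --region "$region" \
    --query 'Stacks[0].StackStatus' --output text
}
```

For example, `drs_wait_for_stack drs-amazon-eventbridge-central-account us-east-1` should print `CREATE_COMPLETE` once the stack is ready. If the template creates an email subscription from `DrsSnsSubscriptionEmailAddress`, that address typically has to confirm the subscription from the message SNS sends before notifications are delivered.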

#### AWS EventBridge DRS Notifications - Per Account
Run the following command to deploy the AWS EventBridge Rules in each active AWS DRS account and region.  Update your local environment credentials for each account before executing: 

```shell
aws cloudformation create-stack --stack-name drs-amazon-eventbridge-per-account \
--template-body file://drs-eventbridge-per-account.yaml \
--parameters ParameterKey=CentralAccountId,ParameterValue="<Enter the AWS Account ID For the Central Logging/Monitoring Account for AWS DRS>" \
ParameterKey=DrsNotificationEventBus,ParameterValue="<Enter the AWS EventBridge Custom Bus name created in Central Account>" \
ParameterKey=DrsEventRuleShowStalledOnlyOrAll,ParameterValue="<Enter Either StalledOnly or All to filter Agent stall status changes to all changes or send only on agent stalling>" \
ParameterKey=DrsStalledAgentRule,ParameterValue="<Enter ENABLED or DISABLED to toggle notifications for AWS DRS Agent Stall status changes>" \
ParameterKey=DrsFailedRecoveryRule,ParameterValue="<Enter ENABLED or DISABLED to toggle notifications for AWS DRS Recovery Instance Failure>" \
--region <enter your aws region id, e.g. "us-east-1">
```
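
If you have many DRS accounts, switching credentials by hand for each deployment becomes tedious. The following sketch loops over a list of `profile:region` pairs and runs the same `create-stack` command for each one. The helper name `drs_deploy_per_account` and the profile names in the usage example are hypothetical, and the rule settings (`StalledOnly`, `ENABLED`) are just one combination of the values described above.

```shell
# Deploy the per-account stack once for every "profile:region" pair.
# Usage: drs_deploy_per_account <central-account-id> <event-bus-name> <profile:region>...
drs_deploy_per_account() {
  central_account_id="$1"; event_bus_name="$2"; shift 2
  for target in "$@"; do
    profile="${target%%:*}"   # text before the first colon
    region="${target##*:}"    # text after the last colon
    echo "Deploying to profile=$profile region=$region"
    aws cloudformation create-stack \
      --stack-name drs-amazon-eventbridge-per-account \
      --template-body file://drs-eventbridge-per-account.yaml \
      --parameters \
        ParameterKey=CentralAccountId,ParameterValue="$central_account_id" \
        ParameterKey=DrsNotificationEventBus,ParameterValue="$event_bus_name" \
        ParameterKey=DrsEventRuleShowStalledOnlyOrAll,ParameterValue=StalledOnly \
        ParameterKey=DrsStalledAgentRule,ParameterValue=ENABLED \
        ParameterKey=DrsFailedRecoveryRule,ParameterValue=ENABLED \
      --profile "$profile" --region "$region" || return 1
  done
}
```

For example, `drs_deploy_per_account 111111111111 drs-central-bus drs-prod:us-east-1 drs-dev:eu-west-1` would deploy to two accounts, assuming named profiles `drs-prod` and `drs-dev` exist in your AWS CLI configuration. The loop stops at the first failure so that a misconfigured account is not silently skipped.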


## CloudWatch Logs

The DRS Replication Agent installed on each source server produces log files in JSON format.  You can ingest these log files into CloudWatch Logs to verify that the agent is running correctly.  

You will need to install the CloudWatch agent on each source server where the DRS agent is installed.  Follow the [instructions in the documentation](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/install-CloudWatch-Agent-on-EC2-Instance.html). 


### Deployment
A CloudFormation template is provided to deploy the dashboards.  You can use the AWS CLI to perform the deployment.

* Apply local environment credentials for the account

#### Per Account Dashboard
Run the following command to deploy the DRS dashboard in each active AWS DRS account and region.  Update your local environment credentials for each account before executing: 
```shell
aws cloudformation create-stack --stack-name drs-amazon-cloudwatch-dashboard \
--template-body file://drs-cloudwatch-dashboard-per-account.yaml \
--region <enter the aws region id where DRS is configured, e.g. "us-east-1">
```

You only need to deploy one CloudWatch dashboard per account, because a single CloudWatch dashboard can display metrics from multiple regions.

#### Cross Account Dashboard
Update your local environment credentials to your centralized / logging DRS account.  

Update the template parameters and dashboard widget to reflect the DRS accounts you have set up.  The example is defined with two DRS accounts.

You must also configure [cross-account functionality](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Cross-Account-Cross-Region.html#enable-cross-account-cross-Region) in CloudWatch to support cross-account dashboards.   

Run the following command to deploy the cross-account dashboard:
```shell
aws cloudformation create-stack --stack-name drs-amazon-cloudwatch-crossaccount-dashboard \
--template-body file://drs-cloudwatch-dashboard-cross-account.yaml \
--parameters ParameterKey=AccountId1,ParameterValue="<AWS Account ID where DRS is configured>" \
ParameterKey=AccountId2,ParameterValue="<AWS Account ID where DRS is configured>" \
--region <enter your aws region id, e.g. "us-east-1">
```

Refer to [CloudWatch cross-account cross-region dashboards](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch_xaxr_dashboard.html) for more details.

#### CloudWatch Logs

1. Install the CloudWatch agent on each of your servers where the DRS agent is running.  Follow the [installation instructions](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/install-CloudWatch-Agent-on-EC2-Instance.html) in the documentation.

2. Copy the configuration file to the CloudWatch agent configuration directory:
* For Linux, copy the [cloudwatch_agent_linux.json](./cloudwatch_agent_linux.json) file to ```/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d```.
* For Windows, copy the [cloudwatch_agent_windows.json](./cloudwatch_agent_windows.json) to ```C:\ProgramData\Amazon\AmazonCloudWatchAgent\Configs```

3. Ensure that the configuration file you have included doesn't overlap with another metric / log specification.  Merge the existing files appropriately.
4.  If the instance is running Linux, run the [log_permissions.sh](./log_permissions.sh) script to add the ```cwagent``` user to the ```aws-replication cwagent``` Linux group.  This script also updates the permissions on the log file to make it readable by the ```aws-replication cwagent``` group.
5.  Restart the CloudWatch agent to ingest the new configuration file.
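
On Linux, steps 2 and 5 above can be sketched as two small helper functions. This is an illustration, not part of the repository: the `drs_*` function names are hypothetical, and the restart step assumes the agent runs as the `amazon-cloudwatch-agent` systemd unit.

```shell
# Step 2: copy the DRS configuration into the agent's configuration directory.
# Both arguments are optional and default to the paths named in the steps above.
drs_install_cw_config() {
  src="${1:-./cloudwatch_agent_linux.json}"
  dest_dir="${2:-/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d}"
  [ -f "$src" ] || { echo "config not found: $src" >&2; return 1; }
  mkdir -p "$dest_dir" && cp "$src" "$dest_dir/"
}

# Step 5: restart the agent so it reads the new configuration file.
# Assumption: the agent is managed by systemd under this unit name.
drs_restart_cw_agent() {
  systemctl restart amazon-cloudwatch-agent
}
```

Run as root, a typical sequence would be `drs_install_cw_config && sh ./log_permissions.sh && drs_restart_cw_agent`. Review step 3 before restarting, because overlapping log or metric definitions in other files in the same directory are merged by the agent.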

##### CloudWatch Metric Filter

You can create a CloudWatch Metric Filter to track warnings and errors generated by the DRS replication agent and posted to the DRS replication agent log file.

The [cloudwatch_logs_metric_filters.yaml](./cloudwatch_logs_metric_filters.yaml) CloudFormation template creates two metrics:  ```DrsReplicationAgentErrors``` and ```DrsReplicationAgentWarnings```.  These metrics count the errors and warnings, respectively, found in the agent log file.

The DRS replication agent writes its log statements in JSON format but currently appends an extraneous ">>>" string to the end of each JSON object.  Therefore, a JSON-specific path search can't be used until this is fixed.  The provided template instead matches on the plain text strings WARNING and ERROR.
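
When inspecting a log file locally, you can work around the trailing ">>>" in the same spirit. The sketch below strips the suffix so each line parses as JSON again, and counts matching lines with plain text matching, similar to what the metric filters do. The `drs_*` helper names are hypothetical, and the sample fields in the usage note are illustrative rather than the agent's actual schema.

```shell
# Remove the trailing ">>>" (and any trailing whitespace) from each line on stdin.
drs_strip_suffix() {
  sed 's/>>>[[:space:]]*$//'
}

# Count the lines in a file that contain a level keyword such as ERROR or WARNING.
# This is plain text matching, so the keyword may match anywhere in the line.
drs_count_level() {
  level="$1"; file="$2"
  grep -c "$level" "$file"
}
```

For example, a line such as `{"level":"ERROR","message":"example"}>>>` becomes `{"level":"ERROR","message":"example"}` after `drs_strip_suffix`, and can then be piped to a JSON tool such as `jq` if one is installed. `drs_count_level ERROR <path to the agent log file>` prints the number of matching lines.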

After you configure CloudWatch Logs as described in this repository, you can deploy the metric filters into each AWS account and region where you have EC2 instances with the DRS agent installed:

```shell
aws cloudformation create-stack --stack-name drs-amazon-cloudwatch-metric-filters \
--template-body file://cloudwatch_logs_metric_filters.yaml \
--region <enter your aws region id, e.g. "us-east-1">
```
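
Once the metric filters are publishing data, you may want an alarm on the error count. The following is a sketch under stated assumptions rather than part of the provided templates: the helper name `drs_create_error_alarm` is hypothetical, the metric namespace must be taken from the `cloudwatch_logs_metric_filters.yaml` template you deployed, and the metric is assumed to be published without dimensions. The SNS topic ARN is a placeholder you supply.

```shell
# Create an alarm that fires when any agent error is logged in a 5-minute window.
# Usage: drs_create_error_alarm <metric-namespace> <sns-topic-arn> <region>
drs_create_error_alarm() {
  namespace="$1"; topic_arn="$2"; region="$3"
  aws cloudwatch put-metric-alarm \
    --alarm-name drs-replication-agent-errors \
    --namespace "$namespace" \
    --metric-name DrsReplicationAgentErrors \
    --statistic Sum --period 300 --evaluation-periods 1 \
    --threshold 0 --comparison-operator GreaterThanThreshold \
    --treat-missing-data notBreaching \
    --alarm-actions "$topic_arn" \
    --region "$region"
}
```

Setting `--treat-missing-data notBreaching` keeps the alarm in the OK state during periods when no matching log lines are written, which is the normal case for a healthy agent.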