## Table of Contents
- [Updating the Solution](#updating)
- [Playbook Structure](#playbooks)
- [Creating a Playbook](#creating)
- [Remediation Design - SHARR Runbooks](#remediation-design)
## Notes
### Use of Solution Id in Remediation Runbooks
The intent is to eventually publish remediation runbooks as AWS-owned documents. Therefore, they do not support use of Solution Id or Solution Version in user agent extra on API calls. Also, since we leverage SSM heavily and SSM executeAWSApi doesn't support this field, the use in some places but not others makes little sense.
# Updating the Solution
## Update CDK Version
Ideally, the CDK version should be updated to the latest each time a new solution version is released. Use run-unit-tests.sh to verify that the snapshot still matches *prior to making any changes to the solution.*
1. Update package.json in the solution root: change all *old version* to *new version*
2. Change *required_cdk_version* in **deployment/build-s3-dist.sh** to the new version
3. Run build-s3-dist.sh, then run-unit-tests.sh. Verify that the snapshot matches.
## Building and Deploying Locally
The solution configuration is in deployment/solution_env.sh. You should not need to change this file.
1) Create a global and a regional bucket for testing
ex. mybucket-reference and mybucket-us-east-1 (regional)
2) Check out the source from CodeCommit
3) cd to deployment
4) execute build-s3-dist.sh. -t DEVTEST instructs the build to prefix the Solution Id with "DEV-" to differentiate from production/customer deployments in solution metrics.
```
build-s3-dist.sh -b -v -t DEVTEST
```
5) use upload-s3-dist.sh to upload the build to your buckets (ex. "us-east-1")
```
upload-s3-dist.sh
```
6) Use CloudFormation to deploy from the "reference" bucket (ex. "mybucket-reference")
# Playbook Structure
## SHARR v1.2+ Architecture
* Uses SSM runbooks. Follow prescription for creating runbook structure and storing runbook yaml files.
* Uses same version_description.txt, support.txt, and README.md files as v1.0.
* An "Orchestration" Step Function recieves findings via CloudWatch Event Rules, determines the runbook to execute in the target account, executes and monitors the runbook execution.
## Playbook Folders
|-playbookname/
|---bin/ [ CDK file for template creation ]
|---ssmdocs/ [ Runbook source ]
|-----scripts/ [ Runbook scripts ]
|-------test/ [ Pytest unit tests ]
|---test/ [ CDK test cases ]
|---description.txt [ Description used for the playbook in Service Catalog ]
|---README.md [ Playbook README file ]
|---support.txt [ For Service Catalog ]
|---version_description.txt [ Desribes current playbook version ]
# Creating a New Playbook
## Adding a Playbook
SHARR v1.2 and later use a Step Function in the Admin account to validates inputs, extract finding data, and execute the target remediation using Systems Manager Runbooks in the member account. These are referred to as "SHARR Runbooks."
The logic is simple and extensible: a CloudWatch Event Rule matches findings and sends them to the Orchestrator Step Function. The Step Function extracts the control Id and uses it to derive the target SHARR Runbook algorithmically - there is no mapping, no if-then-else logic, and therefore nothing to add or update to introduce a new remediation.
Security is **Job 0**. SHARR Runbooks must be tightly secured, validate inputs, and have least-privilege access to **Remediation Runbooks**: runbooks that execute the actual remediation actions. This division of security and function allows for sharing of a remediation action for common controls that exist in more than one standard. SHARR Runbooks should not remediate directly, but use Remediation Runbooks.
**Remediation Runbooks** are AWS-owned or SHARR-owned runbooks that perform a single remediation or remediation step for a specific resource. For example, creating a logging bucket, enabling an AWS Service, or setting a parameter on an AWS Service. The permissions to the service APIs are within the definition of the Remediation Runbook; SHARR Runbooks must be allowed to assume the remediation role.
A playbook is a set of remediations within a Security Standard (ex. "CIS", "AFSBP"). Each Playbook has a standard-specific Step Function ("Orchestrator") that "understands" the JSON format of that standard's Finding data. The Orchestrator does the following:
1. Verify the finding data matches the Standard (ex. CIS, PCI, AFSBP)
2. Identify the control id and target account in the JSON data
3. Derive the runbook name (SHARR-\-\-\)
4. Check the status of the runbook in the target account
5. Execute the runbook
6. Monitor until completion
7. Log execution data and send notifications.
### To create a new Playbook
A sample Playbook is provided as a starting point. The estimated time to create a new playbook with a single remediation is 4 or more hours. Each remediation thereafter will take one or more hours, depending upon whether the remediation runbook already exists.
#### Preconditions
* The Security Standard exists and is supported in AWS Security Hub
* The developer has access to AWS Security Hub to generate Finding json for test cases
* The Security Standard has been enabled AWS Security Hub in the account
#### Create the Definitions
1. Get the long name, short name, and version of the security standard from Security Hub by examining a finding for the Security Standard. It is extremely important that these values match the StandardControlArn in the finding data for the Security Standard, except for securityStandardShortName, which is an abbreviation you choose.
Ex. *AWS Foundational Security Best Practices*
* **StandardControlArn**: arn:aws:securityhub:us-east-1:111111111111:control/aws-foundational-security-best-practices/v/1.0.0/RDS.1
* **securityStandardLongName**: aws-foundational-security-best-practices
* **securityStandardShortName**: AFSBP (can be any value you choose)
* **version**: 1.0.0
For the following example, we will create a PCI DSS v3.2.1 Playbook:
* **StandardControlArn**: arn:aws:securityhub:us-east-1:111111111111:control/pci-dss/v/3.2.1/PCI.IAM.6
* **securityStandardLongName**: pci-dss
* **securityStandardShortName**: PCI
* **version**: 3.2.1
2. Copy source/playbooks/NEWPLAYBOOK to a folder whose name is the short name and version (sans '.'s) concatenated. For our example, PCI321:
```
cd source/playbooks
cp -R NEWPLAYBOOK PCI321
cd PCI321
```
3. rename files in **bin** and **lib**.
* bin/newplaybook.ts -> pci321.ts
* lib/newplaybook-member-stack.ts -> pci321-member-stack.ts
* lib/newplaybook-primary-stack.ts -> pci321-primary.stack.ts
* test/newplaybook_stack.test.ts -> pci321_stack.test.ts
4. Edit ./bin/\.ts - ex. afsbp.ts
* Add environmental variables and description.
* Change all "NEWPLAYBOOK" names to something specific to the new playbook. Consistency is key. Include the version if there is a possbility of supported more than one version concurrently (ex. 'PCI321' rather than 'PCI')
* Set the Security Standard parameters for the standard:
```
const standardShortName = 'PCI'
const standardLongName = 'pci-dss'
const standardVersion = '3.2.1' # DO NOT INCLUDE 'V'
```
* In the *remediations* stringlist will go a list of Control Ids for which you are creating remedations. These Control Ids must match the ControlId from the **StandardControlArn** in the Finding data (more on this in **Runbooks**) Ex. **PCI.IAM.6**
> Note: Until the ssmdocs for each control are created the stack will fail to build. You may wish to comment any out that you have not yet created.
##### Example Playbook Definition
```
#!/usr/bin/env node
/*****************************************************************************
* Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. *
* *
* Licensed under the Apache License, Version 2.0 (the "License"). You may *
* not use this file except in compliance with the License. A copy of the *
* License is located at *
* *
* http://www.apache.org/licenses/LICENSE-2.0 *
* *
* or in the 'license' file accompanying this file. This file is distributed *
* on an 'AS IS' BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, *
* express or implied. See the License for the specific language governing *
* permissions and limitations under the License. *
*****************************************************************************/
import 'source-map-support/register';
import * as cdk from '@aws-cdk/core';
import { PlaybookPrimaryStack, PlaybookMemberStack } from '../../../lib/sharrplaybook-construct';
// SOLUTION_* - set by solution_env.sh
const SOLUTION_ID = process.env['SOLUTION_ID'] || 'undefined';
const SOLUTION_NAME = process.env['SOLUTION_NAME'] || 'undefined';
// DIST_* - set by build-s3-dist.sh
const DIST_VERSION = process.env['DIST_VERSION'] || '%%VERSION%%';
const DIST_OUTPUT_BUCKET = process.env['DIST_OUTPUT_BUCKET'] || '%%BUCKET%%';
const DIST_SOLUTION_NAME = process.env['DIST_SOLUTION_NAME'] || '%%SOLUTION%%';
const standardShortName = 'AFSBP'
const standardLongName = 'aws-foundational-security-best-practices'
const standardVersion = '1.0.0' // DO NOT INCLUDE 'V'
const RESOURCE_PREFIX = SOLUTION_ID.replace(/^DEV-/,''); // prefix on every resource name
const app = new cdk.App();
// Creates one rule per control Id. The Step Function determines what document to run based on
// Security Standard and Control Id. See afsbp-member-stack
const remediations = [
'AutoScaling.1',
'CloudTrail.1',
'CloudTrail.2',
'Config.1',
'EC2.1',
'EC2.2',
'EC2.6',
'EC2.7',
'IAM.7',
'IAM.8',
'Lambda.1',
'RDS.1',
'RDS.6',
'RDS.7'
// 'S3.1'
]
const adminStack = new PlaybookPrimaryStack(app, 'AFSBPStack', {
description: `(${SOLUTION_ID}P) ${SOLUTION_NAME} ${standardShortName} ${standardVersion} Compliance Pack - Admin Account, ${DIST_VERSION}`,
solutionId: SOLUTION_ID,
solutionVersion: DIST_VERSION,
solutionDistBucket: DIST_OUTPUT_BUCKET,
solutionDistName: DIST_SOLUTION_NAME,
remediations: remediations,
securityStandardLongName: standardLongName,
securityStandard: standardShortName,
securityStandardVersion: standardVersion
});
const memberStack = new PlaybookMemberStack(app, 'AFSBPMemberStack', {
description: `(${SOLUTION_ID}C) ${SOLUTION_NAME} ${standardShortName} ${standardVersion} Compliance Pack - Member Account, ${DIST_VERSION}`,
solutionId: SOLUTION_ID,
solutionVersion: DIST_VERSION,
solutionDistBucket: DIST_OUTPUT_BUCKET,
securityStandard: standardShortName,
securityStandardVersion: standardVersion,
securityStandardLongName: standardLongName,
remediations: remediations
});
adminStack.templateOptions.templateFormatVersion = "2010-09-09"
memberStack.templateOptions.templateFormatVersion = "2010-09-09"
```
5. Update test/pci321_stack.test.ts
6. Update cdk.json to point to the new bin/\*.ts name
7. ssmdocs/scripts parse script: the example should work for most Standards. Review what it does and make any adjustments.
8. Update the test script for the parse script. Copy finding json for the Security Standard to use in the test. See AFSBP, CIS for examples.
10. Create the ssmdocs for each control in the ssmdocs folder. This is the runbook that is invoked directly by the Orchestrator.
11. Update support.txt, README.md, description.txt
12. Add the Playbook to source/jest.config.js
At this point you should be able to successfully run build-s3-dist.sh from the /deployment folder
### Create Runbooks
1. Using the provided example, create a runbook definition for each remediation
* Runbook json files should be logically named
* The Runbook they produce will be named SHARR-\-\-\
* standard: abbreviation for the standard. It must match the value for *standardShortName* in the CDK that defines the playbook
* version: version of the standard in semver format. Ex. "1.0.0"
* control: must match the value parsed from *StandardsControlArn* in the finding data.
2. If using executeScript your script code goes in ssmdocs/scripts. See **Using Scripts**
3. Create a parse_input script in ssmdocs/scripts for your finding
* bin
* lib
* ssmdocs
* test - typescript tests and data
3. Create the following files
* cdk.json
* description.txt - describes the playbook and appears in Service Catalog
* README.md - readme file for the playbook
* support.txt - support information to appear in Service Catalog
* tsconfig.json
* version_description.txt - describes this playbook version
# Remediation Design - SHARR Runbooks
SHARR Runbooks receive findings from the Orchestrator Step Function. Their role is to parse, validate, and route the finding data to a remediation runbook, and handle the Finding update. A SHARR runbook generally consists of 4 steps:
**ParseInput**: extracts the data required for the remediation from the finding and environment, determines the AffectedObject (subject of the remediation)
**ExecRemediation**: perform the remediation
**VerifyRemediation**: validate the remediation and post results to Output
**UpdateFinding**: Update the finding status
Note that ExecRemediation and VerifyRemediation can be combined in a single step. Be sure to make the output mapping correct for the step name.
## Design Notes
**NO ERROR HANDLING**
Let it fail. The Automation must fail in order to signal the Step Function. If the remediation handles the error, it must then cause the runbook to fail.
**VERIFY Actions**
If calling another runbook to perform the remediation, the parent remediation runbook should use the VerifyRemediation step to check the output returned by the child runbook.
If remediating directly, use judgement. If the API calls for remediation return an assertion of completion with verifiable data, use it. If not, perform active validation of the remediation.
## Inputs
AutomationAssumeRole:
type: String
description: (Required) The ARN of the role that allows Automation to perform the actions on your behalf.
allowedPattern: ^arn:(aws[a-zA-Z-]*)?:iam::\d{12}:role/[\w+=,.@-]+
Finding:
type: StringMap
description: The input from Step function for the finding
## Outputs
- VerifyRemediation.Output
- ParseInput.AffectedObject
Expected output must be in the Output for the Runbook.
## Scripts
When using *executeScript* actions, the source code for the remediation script is stored in playbooks/\/ssmdocs/scripts in a file named \_\_\.py. This allows the build pipeline to scan the script code, run unit tests, and supports syntax highlighting in the developer IDE vs. embedding hard-coded scripts in the SSM Doc's yaml/json definition. In place of the embedded code, place the following: %%SCRIPT=%% as in the example below. This works syntantically with YAML, and although a blit kludgy, supports the required ability to scan, lint, and test the scripts.
```
- name: ParseInput
action: 'aws:executeScript'
outputs:
- Name: IAMUser
Selector: $.Payload.iamuser
Type: String
- Name: FindingId
Selector: $.Payload.finding_id
Type: String
- Name: ProductArn
Selector: $.Payload.product_arn
Type: String
- Name: AffectedObject
Selector: $.Payload.object
Type: StringMap
inputs:
InputPayload:
Finding: '{{Finding}}'
Runtime: python3.7
Handler: parse_event
Script: |-
%%SCRIPT=cis_1.3_parse_input.py%%
isEnd: false
```
During the build-s3-dist.sh process, the CDK script will insert the script as inline code in the template, as there is currently no way to use attachments via CloudFormation without a Lambda-backed custom resource to create the automation document using the API (which does support it).
# Shared Remediations
In SHARR v1.3 we introduced shared remediations - remediation code that is separate from the security controls. SHARR remediation document names start with **SHARR-** and a name describing what the runbook does. Ex. **SHARR-CreateAccessLoggingBucket** creates a bucket for logging access to another bucket.
## Roles
SO0111-
## Using AWSConfigRemediation AWS-Owned Documents
1. Find the document in the console
2. Go to Content tab
3. Copy the source
4. Save *unaltered* to the solution.
Note that these will be replaced with the AWS-Owned version once they are supported in GovCloud and China.
## Add it to the template
source/solution_deploy/lib/remediation_runbook-stack.ts
# Processing Runbook Output
**`check_ssm_execution.py`** is responsible for monitoring SHARR Parse Runbooks until they finish, determining the outcome, collecting and logging metrics and log data.
Runbook output (AutomationExecutionMetadataList Outputs) contains the output from the runbook. Parse runbooks must retrieve this data from child runbooks.
`check_ssm_execution.py` returns an answer json object:
```
{
'status': string,
'remediation_status': string,
'message': string,
'executionid': uuid,
'affected_object': string,
'logdata': serialized json
}
```
**status**
The state of the remediation execution. This is separate from **remediation\_status** - the runbook can succeed but remediation fail. Conversely, if the runbook failed, likely so did the remediation.
**remediation\_status**
The state of the remediation. This is separate from **status** - the runbook can succeed but remediation fail. Conversely, if the runbook failed, likely so did the remediation.
**message**
The message text to be sent in an SNS message and logs.
**executionid**
UUID of the AutomationExecution
**affected\_object**
\ \