# Automation of ML using Step Functions

Follow the steps below to deploy and run the application once it is downloaded.

## Prerequisites

* Python 3.
* An AWS account.
* Configured IAM permissions.
* [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) installed.
* Docker installed. Note: Docker is only a prerequisite for testing your application locally.
* Homebrew installed. Note: Homebrew is only a prerequisite for Linux and macOS.
* [AWS SAM](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-install.html) installed. Note: Make sure you have version 0.33.0 or later. You can check which version you have by running `sam --version`.
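
The version requirement above can also be checked programmatically. The following is a minimal sketch, assuming the output of `sam --version` contains a dotted version number such as `SAM CLI, version 0.47.0`; the helper names `parse_sam_version` and `sam_version_ok` are hypothetical and not part of this project.

```python
import re

# Minimum AWS SAM CLI version stated in the prerequisites.
MIN_SAM_VERSION = (0, 33, 0)


def parse_sam_version(output: str) -> tuple:
    """Extract (major, minor, patch) from text such as 'SAM CLI, version 0.47.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"no version number found in: {output!r}")
    return tuple(int(part) for part in match.groups())


def sam_version_ok(output: str) -> bool:
    """Return True when the reported version meets the minimum."""
    return parse_sam_version(output) >= MIN_SAM_VERSION


# Example with a sample string; in practice, pass the text printed by `sam --version`.
print(sam_version_ok("SAM CLI, version 0.47.0"))
```

Comparing tuples of integers avoids the pitfall of comparing version strings lexically, where `0.9.0` would sort after `0.33.0`.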

## Step 1 Download Application

* Download the application from the GitHub repo (**Need to update the path Or install using [SAR](SAR path)**)

## Step 2 Initiate Application

* Go to the download path and initialize the application. For more details, see the [`sam init` reference](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/sam-cli-command-reference-sam-init.html).
  
   **Command to run:**
   ~~~~ aws
   sam init -l /Users/sam-demo/step-pipeline.zip
   ~~~~

## Step 3 Create S3 Bucket and move dependencies

* Create the S3 bucket. S3 bucket names are globally unique, so use your own name in place of `blog-step-pipeline-demo`.
  
  **Command to run:**
  
  ~~~~ aws
  aws s3 mb s3://blog-step-pipeline-demo
  ~~~~

* Copy the dependencies to the bucket you created, replacing `<your bucket name>` with its name.

  **Command to run:**
  
  ~~~~ aws
  aws s3 cp sample_ml_code/kmeansandey.py s3://<your bucket name>/testcode/kmeansandey.py
  aws s3 cp sample_ml_code/kmeanswsssey.py s3://<your bucket name>/testcode/kmeanswsssey.py
  aws s3 cp emr/bootstrapactions.sh s3://<your bucket name>/emr-bootstrap-scripts/bootstrapactions.sh
  aws s3 cp emr/emr-cluster-config.json s3://<your bucket name>/emr-cluster-config.json
  aws s3 cp emr/emr-cluster-sample.yaml s3://<your bucket name>/emr-cluster-sample.yaml
  ~~~~
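
To avoid editing `<your bucket name>` by hand in every command, the copy commands can be generated from a single bucket name. This is a minimal sketch, not part of the project: the `upload_commands` helper is hypothetical, the file list mirrors the commands above, and it assumes the cluster config file lives under `emr/` alongside the other EMR files. The name check reflects the general S3 naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit).

```python
import re
import shlex

# Local path -> S3 key, mirroring the copy commands in this step.
# Assumption: emr-cluster-config.json is located in the emr/ directory.
UPLOADS = {
    "sample_ml_code/kmeansandey.py": "testcode/kmeansandey.py",
    "sample_ml_code/kmeanswsssey.py": "testcode/kmeanswsssey.py",
    "emr/bootstrapactions.sh": "emr-bootstrap-scripts/bootstrapactions.sh",
    "emr/emr-cluster-config.json": "emr-cluster-config.json",
    "emr/emr-cluster-sample.yaml": "emr-cluster-sample.yaml",
}

# Basic shape of a valid S3 bucket name (does not cover every naming rule).
BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def upload_commands(bucket: str) -> list:
    """Return one `aws s3 cp` command per dependency for the given bucket."""
    if not BUCKET_NAME_RE.match(bucket):
        raise ValueError(f"invalid S3 bucket name: {bucket!r}")
    return [
        f"aws s3 cp {shlex.quote(src)} s3://{bucket}/{key}"
        for src, key in UPLOADS.items()
    ]


for command in upload_commands("blog-step-pipeline-demo"):
    print(command)
```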

## Step 4 Update samconfig.toml

* Update the bucket name, prefix, region, stack name, and `parameter_overrides` in **samconfig.toml**.

    ~~~~aws
    version = 0.1
    [default]
    [default.deploy]
    [default.deploy.parameters]
    stack_name = "step-pipeline"
    s3_bucket = "blog-step-pipeline-demo"
    s3_prefix = "step-pipeline"
    region = "us-east-1"
    confirm_changeset = true
    capabilities = "CAPABILITY_IAM"
    parameter_overrides = "S3Bucket=\"blog-step-pipeline-demo\" SNSEndpoint=\"youremailid\" SNSEndpointType=\"email\""
    ~~~~
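
The `parameter_overrides` value is a single TOML string containing space-separated `Key="value"` pairs with escaped inner quotes, which is easy to get wrong when editing by hand. The sketch below shows one way to generate that line from a dictionary; the function names are hypothetical, and the parameter names are taken from the sample configuration above.

```python
def build_parameter_overrides(params: dict) -> str:
    """Return Key="value" pairs separated by spaces (not yet TOML-escaped)."""
    return " ".join(f'{key}="{value}"' for key, value in params.items())


def toml_line(params: dict) -> str:
    """Render the samconfig.toml line, escaping backslashes and quotes
    so the value is a valid TOML basic string."""
    raw = build_parameter_overrides(params)
    escaped = raw.replace("\\", "\\\\").replace('"', '\\"')
    return f'parameter_overrides = "{escaped}"'


params = {
    "S3Bucket": "blog-step-pipeline-demo",
    "SNSEndpoint": "youremailid",
    "SNSEndpointType": "email",
}
print(toml_line(params))
```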

## Step 5 Validate Application

* Validate the application. For more details please follow this [link](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/sam-cli-command-reference-sam-validate.html).

    **Command to run:**

    ~~~~aws
    sam validate
    ~~~~

## Step 6 Build Application

* Build the application. For more details please follow this [link](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/sam-cli-command-reference-sam-build.html).

    **Command to run:**

    ~~~~aws
    sam build
    ~~~~

## Step 7 Deploy Application

* Deploy the application. For more details please follow this [link](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/sam-cli-command-reference-sam-deploy.html).

    **Command to run:**

    ~~~~aws
    sam deploy
    ~~~~

    The command first displays the values used for deployment.

    ~~~~aws
    Deploying with following values
    ===============================
    Stack name                 : step-pipeline
    Region                     : us-east-1
    Confirm changeset          : True
    Deployment s3 bucket       : blog-step-pipeline-demo
    Capabilities               : ["CAPABILITY_IAM"]
    Parameter overrides        : {'S3Bucket': 'blog-step-pipeline-demo', 'SNSEndpoint': 'youremailid', 'SNSEndpointProtocol': 'email'}
    ~~~~

    It then initiates the deployment and uploads the build artifacts to the S3 bucket.

    ~~~~aws
    Initiating deployment
    =====================
    Uploading to step-pipeline/4563ab22057c6195ada5230ff89d4479  7200902 / 7200902.0  (100.00%)
    Uploading to step-pipeline/c7a6fc91b11999f3b1c90dfe05eb8005  7200809 / 7200809.0  (100.00%)
    Uploading to step-pipeline/0927ee565b4ae38859af6b6f9cb721f8  7200628 / 7200628.0  (100.00%)
    Uploading to step-pipeline/e055863aee6da95366b93c06cd84f185  7200758 / 7200758.0  (100.00%)
    Uploading to step-pipeline/d60c001cc0c8a04106a940e38344147e  7204253 / 7204253.0  (100.00%)
    Uploading to step-pipeline/e33924dad68b633c31e08d720a324f99  7200734 / 7200734.0  (100.00%)
    Uploading to step-pipeline/dfd99ba1ae2496f85b8887292ab9fb99  7200707 / 7200707.0  (100.00%)
    Uploading to step-pipeline/b1e626e5230d7094dfdaf49b98370c32  861 / 861.0  (100.00%)
    Uploading to step-pipeline/a94703141e7e92293d32db7559aad1ca  7200660 / 7200660.0  (100.00%)
    Uploading to step-pipeline/3aacadd15742d7e719e130651e82fdaa  7200783 / 7200783.0  (100.00%)
    Uploading to step-pipeline/99137343607be51efba7ad6624feb558  7200688 / 7200688.0  (100.00%)
    Uploading to step-pipeline/b3c6b442d80482459f8b1e2eaa250da7  7200826 / 7200826.0  (100.00%)
    Uploading to step-pipeline/7dd8e4f2511800e09114cd49d1036e6b.template  31417 / 31417.0  (100.00%)

    Waiting for changeset to be created..
    ~~~~

    Next, it will display the CloudFormation stack changeset.

    ~~~~aws
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Operation                                                           LogicalResourceId                                                   ResourceType
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    + Add                                                               AddStepLambdaRole                                                   AWS::IAM::Role
    + Add                                                               AddStepLambda                                                       AWS::Lambda::Function
    + Add                                                               AsyncStartStateMachineLambda                                        AWS::Lambda::Function
    + Add                                                               CFNLambdaRole                                                       AWS::IAM::Role
    + Add                                                               CheckClusterLambda                                                  AWS::Lambda::Function
    + Add                                                               CheckStepLambda                                                     AWS::Lambda::Function
    + Add                                                               CloudFormationRole                                                  AWS::IAM::Role
    + Add                                                               CreateCFNStackLambda                                                AWS::Lambda::Function
    + Add                                                               DeleteCFNStackLambda                                                AWS::Lambda::Function
    + Add                                                               DescribeCFNStackLambda                                              AWS::Lambda::Function
    + Add                                                               FailureLambda                                                       AWS::Lambda::Function
    + Add                                                               GetArrayLengthLambda                                                AWS::Lambda::Function
    + Add                                                               GetClusterIdLambda                                                  AWS::Lambda::Function
    + Add                                                               MLPipelineAlertingSNSTopic                                          AWS::SNS::Topic
    + Add                                                               MLStateMachine                                                      AWS::StepFunctions::StateMachine
    + Add                                                               StartStateMachineLambdaRole                                         AWS::IAM::Role
    + Add                                                               StartStateMachineLambda                                             AWS::Lambda::Function
    + Add                                                               StateMachineRole                                                    AWS::IAM::Role
    + Add                                                               StepFunctionsRole                                                   AWS::IAM::Role
    + Add                                                               SubmitStepStateMachine                                              AWS::StepFunctions::StateMachine
    + Add                                                               SuccessFailureLambdaRole                                            AWS::IAM::Role
    + Add                                                               SuccessLambda                                                       AWS::Lambda::Function
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    Changeset created successfully.
    ~~~~

    Next, it asks for deployment confirmation. Enter **Y** if you agree with the CloudFormation stack changeset; otherwise enter **N** and make the necessary corrections.

    ~~~~aws
    Previewing CloudFormation changeset before deployment
    ======================================================
    Deploy this changeset? [y/N]: Y
    ~~~~

    It then reports the status of the deployment as the resources are created.

    ~~~~aws
    2020-04-05 14:30:07 - Waiting for stack create/update to complete

    CloudFormation events from changeset
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    ResourceStatus                                     ResourceType                                       LogicalResourceId                                  ResourceStatusReason
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    CREATE_IN_PROGRESS                                 AWS::IAM::Role                                     StateMachineRole                                   Resource creation Initiated
    CREATE_IN_PROGRESS                                 AWS::IAM::Role                                     CloudFormationRole                                 Resource creation Initiated
    CREATE_IN_PROGRESS                                 AWS::IAM::Role                                     StartStateMachineLambdaRole                        Resource creation Initiated
    CREATE_IN_PROGRESS                                 AWS::IAM::Role                                     SuccessFailureLambdaRole                           Resource creation Initiated
    CREATE_IN_PROGRESS                                 AWS::SNS::Topic                                    MLPipelineAlertingSNSTopic                         Resource creation Initiated
    CREATE_IN_PROGRESS                                 AWS::IAM::Role                                     AddStepLambdaRole                                  Resource creation Initiated
    CREATE_IN_PROGRESS                                 AWS::IAM::Role                                     StateMachineRole                                   -
    CREATE_IN_PROGRESS                                 AWS::IAM::Role                                     CloudFormationRole                                 -
    CREATE_IN_PROGRESS                                 AWS::IAM::Role                                     SuccessFailureLambdaRole                           -
    CREATE_IN_PROGRESS                                 AWS::IAM::Role                                     StartStateMachineLambdaRole                        -
    CREATE_IN_PROGRESS                                 AWS::SNS::Topic                                    MLPipelineAlertingSNSTopic                         -
    CREATE_IN_PROGRESS                                 AWS::IAM::Role                                     AddStepLambdaRole                                  -
    CREATE_COMPLETE                                    AWS::SNS::Topic                                    MLPipelineAlertingSNSTopic                         -
    CREATE_COMPLETE                                    AWS::IAM::Role                                     SuccessFailureLambdaRole                           -
    CREATE_COMPLETE                                    AWS::IAM::Role                                     AddStepLambdaRole                                  -
    CREATE_COMPLETE                                    AWS::IAM::Role                                     StartStateMachineLambdaRole                        -
    CREATE_COMPLETE                                    AWS::IAM::Role                                     CloudFormationRole                                 -
    CREATE_COMPLETE                                    AWS::IAM::Role                                     StateMachineRole                                   -
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              CheckClusterLambda                                 -
    CREATE_IN_PROGRESS                                 AWS::IAM::Role                                     CFNLambdaRole                                      -
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              AsyncStartStateMachineLambda                       -
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              StartStateMachineLambda                            -
    CREATE_IN_PROGRESS                                 AWS::IAM::Role                                     CFNLambdaRole                                      Resource creation Initiated
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              FailureLambda                                      -
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              AddStepLambda                                      -
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              SuccessLambda                                      -
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              CheckStepLambda                                    -
    CREATE_COMPLETE                                    AWS::Lambda::Function                              FailureLambda                                      -
    CREATE_COMPLETE                                    AWS::Lambda::Function                              SuccessLambda                                      -
    CREATE_COMPLETE                                    AWS::Lambda::Function                              AsyncStartStateMachineLambda                       -
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              AddStepLambda                                      Resource creation Initiated
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              FailureLambda                                      Resource creation Initiated
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              CheckClusterLambda                                 Resource creation Initiated
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              SuccessLambda                                      Resource creation Initiated
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              CheckStepLambda                                    Resource creation Initiated
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              AsyncStartStateMachineLambda                       Resource creation Initiated
    CREATE_COMPLETE                                    AWS::Lambda::Function                              StartStateMachineLambda                            -
    CREATE_COMPLETE                                    AWS::Lambda::Function                              CheckClusterLambda                                 -
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              StartStateMachineLambda                            Resource creation Initiated
    CREATE_COMPLETE                                    AWS::Lambda::Function                              AddStepLambda                                      -
    CREATE_COMPLETE                                    AWS::Lambda::Function                              CheckStepLambda                                    -
    CREATE_IN_PROGRESS                                 AWS::StepFunctions::StateMachine                   SubmitStepStateMachine                             -
    CREATE_COMPLETE                                    AWS::StepFunctions::StateMachine                   SubmitStepStateMachine                             -
    CREATE_IN_PROGRESS                                 AWS::StepFunctions::StateMachine                   SubmitStepStateMachine                             Resource creation Initiated
    CREATE_COMPLETE                                    AWS::IAM::Role                                     CFNLambdaRole                                      -
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              GetArrayLengthLambda                               -
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              DeleteCFNStackLambda                               -
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              GetClusterIdLambda                                 -
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              DeleteCFNStackLambda                               Resource creation Initiated
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              GetArrayLengthLambda                               Resource creation Initiated
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              CreateCFNStackLambda                               -
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              DescribeCFNStackLambda                             -
    CREATE_COMPLETE                                    AWS::Lambda::Function                              DescribeCFNStackLambda                             -
    CREATE_COMPLETE                                    AWS::Lambda::Function                              GetClusterIdLambda                                 -
    CREATE_COMPLETE                                    AWS::Lambda::Function                              DeleteCFNStackLambda                               -
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              CreateCFNStackLambda                               Resource creation Initiated
    CREATE_COMPLETE                                    AWS::Lambda::Function                              GetArrayLengthLambda                               -
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              GetClusterIdLambda                                 Resource creation Initiated
    CREATE_IN_PROGRESS                                 AWS::Lambda::Function                              DescribeCFNStackLambda                             Resource creation Initiated
    CREATE_COMPLETE                                    AWS::Lambda::Function                              CreateCFNStackLambda                               -
    CREATE_IN_PROGRESS                                 AWS::IAM::Role                                     StepFunctionsRole                                  -
    CREATE_IN_PROGRESS                                 AWS::IAM::Role                                     StepFunctionsRole                                  Resource creation Initiated
    CREATE_COMPLETE                                    AWS::IAM::Role                                     StepFunctionsRole                                  -
    CREATE_IN_PROGRESS                                 AWS::StepFunctions::StateMachine                   MLStateMachine                                     -
    CREATE_COMPLETE                                    AWS::StepFunctions::StateMachine                   MLStateMachine                                     -
    CREATE_IN_PROGRESS                                 AWS::StepFunctions::StateMachine                   MLStateMachine                                     Resource creation Initiated
    CREATE_COMPLETE                                    AWS::CloudFormation::Stack                         step-pipeline                                      -
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    CloudFormation outputs from deployed stack
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Outputs
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Key                 GetArrayLengthLambda
    Description         GetArrayLengthLambda ARN
    Value               arn:aws:lambda:us-east-1:accountno:function:step-pipeline-GetArrayLengthLambda-1T4XSBG98PC9U

    Key                 CheckStepLambda
    Description         CheckStepLambda ARN
    Value               arn:aws:lambda:us-east-1:accountno:function:step-pipeline-CheckStepLambda-RJ52ECLEXAID

    Key                 DescribeCFNStackLambda
    Description         DescribeCFNStackLambda ARN
    Value               arn:aws:lambda:us-east-1:accountno:function:step-pipeline-DescribeCFNStackLambda-QFLGOR605OZH

    Key                 CloudFormationRole
    Description         CloudFormationRole ARN
    Value               arn:aws:iam::accountno:role/step-pipeline-CloudFormationRole-EWLP9ZPWQ0OE

    Key                 AddStepLambda
    Description         AddStepLambda ARN
    Value               arn:aws:lambda:us-east-1:accountno:function:step-pipeline-AddStepLambda-RCJEBTR16LNY

    Key                 StateMachineRole
    Description         StateMachineRole ARN
    Value               arn:aws:iam::accountno:role/step-pipeline-StateMachineRole-1PFEYND3O1V6

    Key                 AsyncStartStateMachineLambda
    Description         AsyncStartStateMachineLambda ARN
    Value               arn:aws:lambda:us-east-1:accountno:function:step-pipeline-AsyncStartStateMachineLambda-YRDJ4C1H6ZU2

    Key                 StepFunctionsRole
    Description         StepFunctionsRole ARN
    Value               arn:aws:iam::accountno:role/step-pipeline-StepFunctionsRole-L91VR6FBYNUU

    Key                 CreateCFNStackLambda
    Description         CreateCFNStackLambda ARN
    Value               arn:aws:lambda:us-east-1:accountno:function:step-pipeline-CreateCFNStackLambda-ASPDKV2R2K9C

    Key                 GetClusterIdLambda
    Description         GetClusterIdLambda ARN
    Value               arn:aws:lambda:us-east-1:accountno:function:step-pipeline-GetClusterIdLambda-1DGHD479X7XP1

    Key                 FailureLambda
    Description         FailureLambda ARN
    Value               arn:aws:lambda:us-east-1:accountno:function:step-pipeline-FailureLambda-V2O9Q61PYHK5

    Key                 CheckClusterLambda
    Description         CheckClusterLambda ARN
    Value               arn:aws:lambda:us-east-1:accountno:function:step-pipeline-CheckClusterLambda-1KBKN9R4RZQJP

    Key                 SuccessLambda
    Description         SuccessLambda ARN
    Value               arn:aws:lambda:us-east-1:accountno:function:step-pipeline-SuccessLambda-1CG3FUU2D8ME

    Key                 DeleteCFNStackLambda
    Description         DeleteCFNStackLambda ARN
    Value               arn:aws:lambda:us-east-1:accountno:function:step-pipeline-DeleteCFNStackLambda-64SSJVO1GS8N

    Key                 StartStateMachineLambda
    Description         StartStateMachineLambda ARN
    Value               arn:aws:lambda:us-east-1:accountno:function:step-pipeline-StartStateMachineLambda-FU1NRX7R3J3N
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    Successfully created/updated stack - step-pipeline in us-east-1
    ~~~~
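
The output values above are ARNs, which follow the general layout `arn:partition:service:region:account-id:resource`. If you need the region or resource name from one of them in a script, a minimal sketch such as the following can split an ARN into its parts; `Arn` and `parse_arn` are hypothetical names, not part of the deployed application.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str
    account: str
    resource: str


def parse_arn(arn: str) -> Arn:
    """Split an ARN into its five fields after the leading 'arn' prefix.

    The resource part may itself contain ':' or '/', so split at most 5 times.
    """
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn!r}")
    return Arn(*parts[1:])


# Sample value copied from the stack outputs (with the account placeholder).
arn = parse_arn(
    "arn:aws:lambda:us-east-1:accountno:function:step-pipeline-FailureLambda-V2O9Q61PYHK5"
)
print(arn.service, arn.region, arn.resource)
```

Note that IAM role ARNs, such as the `StateMachineRole` output, have an empty region field because IAM is a global service.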

## Step 8 Confirm SNS Notification

* You will receive an email from AWS Notifications at the specified address. Confirm the SNS subscription by following the link in that email.

## Step 9 Test Application

* Update the S3 bucket, security group, and subnet information in /events/events.JSON
  
  ~~~~JSON
  {
    "ModelName": "Model-Name",
    "ModelProgram": "s3://<your bucket name>/testcode/kmeansandey.py",
    "PreProcessingProgram": "s3://<your bucket name>/testcode/kmeanswsssey.py",
    "EMRCloudFormation": "https://s3.amazonaws.com/<your bucket name>/emr-cluster-sample.yaml",
    "EMRParameters": "https://s3.amazonaws.com/<your bucket name>/emr-cluster-config.json",
    "JobInput": "s3://aws-bigdata-blog/artifacts/anomaly-detection-using-pyspark/sensorinputsmall/",
    "SecurityGroup": "<your-security-group>",
    "SubNet": "<your-subnet>",
    "ClusterSize": "no-of-clusters",
    "ProcessingMode": ["TRAINING"]
  }
  ~~~~

  Note: Do not change the `JobInput` value for the demo.

* Go to the [AWS Step Functions](https://console.aws.amazon.com/states/home?/statemachines) console and copy the ARN of the state machine named **MLStateMachine**.

* Test the state machine by submitting an event, using the ARN you copied:

  ~~~~ aws
  aws stepfunctions start-execution --state-machine-arn arn:aws:states:<your region>:<your accountid>:stateMachine:awsblog-MLStateMachine-testproject --name test1 --input file://events/event.json
  ~~~~

* You should now be able to view the Step Functions execution, the new EMR cluster being created, and the steps being executed. Once the process completes, the output is available in `<your bucket name>/output/`.
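
The test steps above can be scripted so that the event and the CLI command are built from a few inputs. The sketch below is illustrative only: `build_event` and `start_execution_command` are hypothetical helpers, the field names come from the sample event in this step, and `ClusterSize` is passed as a string because that is how the sample shows it. Step Functions requires execution names to be unique per state machine, so the sketch appends a UTC timestamp rather than reusing a fixed name such as `test1`.

```python
import json
import shlex
from datetime import datetime, timezone


def build_event(bucket: str, security_group: str, subnet: str,
                cluster_size: str, model_name: str = "Model-Name") -> dict:
    """Fill in the sample event for the given bucket and network settings."""
    return {
        "ModelName": model_name,
        "ModelProgram": f"s3://{bucket}/testcode/kmeansandey.py",
        "PreProcessingProgram": f"s3://{bucket}/testcode/kmeanswsssey.py",
        "EMRCloudFormation": f"https://s3.amazonaws.com/{bucket}/emr-cluster-sample.yaml",
        "EMRParameters": f"https://s3.amazonaws.com/{bucket}/emr-cluster-config.json",
        # Demo input data set; left unchanged as instructed in this step.
        "JobInput": "s3://aws-bigdata-blog/artifacts/anomaly-detection-using-pyspark/sensorinputsmall/",
        "SecurityGroup": security_group,
        "SubNet": subnet,
        "ClusterSize": cluster_size,
        "ProcessingMode": ["TRAINING"],
    }


def start_execution_command(state_machine_arn: str, event_path: str,
                            prefix: str = "test") -> str:
    """Build the `aws stepfunctions start-execution` command line."""
    # Timestamp suffix keeps the execution name unique across runs.
    name = f"{prefix}-{datetime.now(timezone.utc):%Y%m%dT%H%M%S}"
    args = [
        "aws", "stepfunctions", "start-execution",
        "--state-machine-arn", state_machine_arn,
        "--name", name,
        "--input", f"file://{event_path}",
    ]
    return " ".join(shlex.quote(arg) for arg in args)


# Placeholder values; substitute your own bucket, security group, and subnet.
event = build_event("my-bucket", "sg-0123456789abcdef0", "subnet-0123456789abcdef0", "2")
print(json.dumps(event, indent=2))
```

Write the printed JSON to your event file, then run the command returned by `start_execution_command` with the ARN copied from the console.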


## License

This library is licensed under the MIT-0 License. See the LICENSE file.