# Lab 2: Simulate Taxi Booking, Billing and Payments using Amazon DynamoDB, DynamoDB Streams, AWS Lambda and Amazon Aurora PostgreSQL

- [Overview](#overview)
- [Prerequisites](#prerequisites)
- [Setup the AWS Cloud9 Environment](#setup-the-aws-cloud9-environment)
  * [Update SAM CLI](#update-sam-cli)
  * [Install AWS SDK for Python](#install-aws-sdk-for-python)
  * [Save CloudFormation Stack name as a variable](#save-cloudformation-stack-name-as-a-variable)
- [Enable Amazon DynamoDB Streams](#enable-amazon-dynamodb-streams)
- [Deploying AWS Lambda Function](#deploying-aws-lambda-function)
  * [Packaging the PG8000 binaries](#packaging-the-pg8000-binaries)
  * [Deploy AWS Lambda Function and AWS Lambda Layer using AWS SAM template](#deploy-aws-lambda-function-and-aws-lambda-layer-using-aws-sam-template)
- [Taxi Ride Workflow](#taxi-ride-workflow)
  * [Taxi Trip Booking Workflow](#taxi-trip-booking-workflow)
  * [Driver Billing and Payments](#driver-billing-and-payments)

## Overview

In this lab, you will simulate the booking of a taxi trip by a rider and its acceptance by a driver, followed by billing and payment, using Python scripts and SQL commands. You will use DynamoDB Streams and an AWS Lambda function to insert completed trip data from DynamoDB into Aurora PostgreSQL.

![architecture.png](./assets/architecture.png)

## Prerequisites

1. You should have completed [Lab 1](https://github.com/aws-samples/amazon-rds-purpose-built-workshop/tree/master/lab1-TaxiDataMigration) from the [GitHub repository](https://github.com/aws-samples/amazon-rds-purpose-built-workshop).

## Setup the AWS Cloud9 Environment

### Update SAM CLI

1. Open the AWS Management Console for [AWS Cloud9](https://us-east-1.console.aws.amazon.com/cloud9/home/account). You will use the AWS Cloud9 IDE throughout this lab for running scripts, deploying AWS SAM (Serverless Application Model) templates, executing SQL queries, etc.
2. Click on __Open IDE__ for the AWS Cloud9 IDE that was created as part of the AWS CloudFormation template you deployed.
3. Open a terminal window in the AWS Cloud9 IDE by clicking on __Window__ in the menu bar at the top and selecting __New Terminal__.
4. Update the AWS SAM CLI to the latest version by running the following commands in the Cloud9 terminal window.

```shell script
cd ~/environment
pip install --user --upgrade awscli aws-sam-cli
sam --version
```

> Note: Ensure that the SAM CLI version is 0.21.0 or above.

### Install AWS SDK for Python

- To install [Boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html?id=docs_gateway) (the AWS SDK for Python), copy and paste the following commands in the terminal window in the AWS Cloud9 IDE.

```shell script
cd ~/environment
curl -O https://bootstrap.pypa.io/get-pip.py # Get the install script.
sudo python3 get-pip.py                      # Install pip.
pip3 install boto3 --user
```
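Optionally, you can confirm that Boto3 is installed and can reach your AWS account with a quick smoke test. This is a minimal sketch; it simply prints the IAM identity that the Cloud9-managed credentials resolve to. Run it with `python3`.

```python
import boto3

# If this prints an ARN, the SDK is installed and credentials are working.
print(boto3.client('sts').get_caller_identity()['Arn'])
```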
### Save CloudFormation Stack name as a variable

1. Open the AWS Management Console for CloudFormation from [here](https://us-east-1.console.aws.amazon.com/cloudformation/home?region=us-east-1).
2. In the upper-right corner of the AWS Management Console, confirm you are in the US East (N. Virginia) Region.
3. Click on __Stacks__ in the navigation pane.
4. Under __Stacks__, copy the parent CloudFormation stack name starting with **mod-** (e.g. _mod-aa8afde9acf04c7f_).
5. Substitute the string (_substitute-name-of-copied-cf-stack-name_) in the command below with the name of the CloudFormation stack and run it in the Cloud9 terminal window.

```shell script
AWSDBWORKSHOP_CFSTACK_NAME="substitute-name-of-copied-cf-stack-name"
```

6. Copy and paste the following commands in the Cloud9 terminal window to set the environment variable _$AWSDBWORKSHOP_CFSTACK_NAME_.

```shell script
echo "export AWSDBWORKSHOP_CFSTACK_NAME=${AWSDBWORKSHOP_CFSTACK_NAME}" >> ~/.bash_profile
. ~/.bash_profile
echo $AWSDBWORKSHOP_CFSTACK_NAME
```

> Note: Ensure that the name of the CloudFormation stack printed in the terminal window (e.g. _mod-aa8afde9acf04c7f_) matches the name of your parent CloudFormation stack.

## Enable Amazon DynamoDB Streams

In this section, you will enable a DynamoDB stream on the Amazon DynamoDB table named _'aws-db-workshop-trips'_ that was created as part of the CloudFormation stack.

- Copy and paste the commands below in the Cloud9 terminal window to enable a stream for the DynamoDB table named _'aws-db-workshop-trips'_.

```shell script
# Enable the stream and capture the stream ID; ${STREAM_ID::-1} strips the
# trailing quote left over from jq's quoted output.
STREAM_ID=$(aws dynamodb update-table --table-name aws-db-workshop-trips --stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES | jq '.TableDescription.LatestStreamArn' | cut -d'/' -f4)
STREAM_NAME=stream/${STREAM_ID::-1}
echo "export AWSDBWORKSHOP_DDB_STREAM_NAME=${STREAM_NAME}" >> ~/.bash_profile
. ~/.bash_profile
echo $AWSDBWORKSHOP_DDB_STREAM_NAME
```

> Note: The output should be similar to the following
>```
>stream/2019-09-18T20:18:33.343
>```

Now that you have enabled the Amazon DynamoDB stream, the next step is to deploy the AWS Lambda function that will process records from the stream.

## Deploying AWS Lambda Function

In this section, you will use the AWS Serverless Application Model ([SAM](https://aws.amazon.com/serverless/sam/)) CLI to deploy a Lambda function within the same Amazon Virtual Private Cloud ([VPC](https://aws.amazon.com/vpc/)) as the Aurora PostgreSQL cluster. The SAM deployment will also include a Python interface to the PostgreSQL database engine as an AWS Lambda Layer. This Lambda function will read taxi trip records from the DynamoDB stream as they are inserted/updated in the DynamoDB table **_aws-db-workshop-trips_**. Only when a trip is complete (denoted by the _STATUS_ attribute in the trip item/record) does the Lambda function insert the trip information into the **_trips_** table in Aurora PostgreSQL.

### Packaging the PG8000 binaries

In this section, you will download and package the binaries for [PG8000](https://pypi.org/project/pg8000/) - a Python interface to PostgreSQL. The package will be deployed as an AWS Lambda Layer.

- Copy and paste the below commands in the Cloud9 terminal window.

```shell script
cd ~/environment
mkdir pglayer
virtualenv -p python3 pglayer
cd pglayer
source bin/activate
mkdir -p pg8000-layer/python
pip install pg8000 -t pg8000-layer/python
cd pg8000-layer
zip -r pg8000-layer.zip python
mkdir ~/environment/amazon-rds-purpose-built-workshop/src/ddb-stream-processor/dependencies/
cp ~/environment/pglayer/pg8000-layer/pg8000-layer.zip ~/environment/amazon-rds-purpose-built-workshop/src/ddb-stream-processor/dependencies/
```
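Before deploying, it may help to see roughly how the Lambda function uses this layer. The actual source lives in _src/ddb-stream-processor_; the sketch below is illustrative only, and the environment variable names, attribute names, and column list in it are assumptions rather than the deployed code.

```python
import os
import pg8000  # imported from the Lambda Layer packaged above

def handler(event, context):
    # Connection details would come from the function's configuration;
    # these environment variable names are assumptions for illustration.
    conn = pg8000.connect(
        user=os.environ['DB_USER'],
        host=os.environ['DB_HOST'],
        database=os.environ['DB_NAME'],
        password=os.environ['DB_PASSWORD'],
    )
    cur = conn.cursor()
    for record in event['Records']:
        # Stream records carry the item in DynamoDB's typed JSON format.
        image = record['dynamodb'].get('NewImage', {})
        # Only completed trips are copied into Aurora PostgreSQL.
        # (The real function maps many more trip attributes than shown here.)
        if image.get('STATUS', {}).get('S') == 'Completed':
            cur.execute(
                'insert into trips (trip_info, status) values (%s, %s)',
                (image['tripinfo']['S'], image['STATUS']['S']),
            )
    conn.commit()
    conn.close()
```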
### Deploy AWS Lambda Function and AWS Lambda Layer using AWS SAM template

In this section, you will validate, package, and deploy the SAM template that contains the configuration for the Lambda function and the Lambda Layer.

1. To validate the SAM template, copy and paste the commands below in the terminal window.

```shell script
cd ~/environment/amazon-rds-purpose-built-workshop/src/ddb-stream-processor
sam validate
```

> Note: The terminal window output should display the following:
> ```
> /home/ec2-user/environment/amazon-rds-purpose-built-workshop/src/ddb-stream-processor/template.yaml is a valid SAM Template
>```

2. To package the AWS SAM application, copy and paste the commands below in the Cloud9 terminal window. This will create a _template-out.yaml_ file in the same folder and upload the packaged binaries to the specified Amazon S3 bucket.

```shell script
cd ~/environment/amazon-rds-purpose-built-workshop/src/ddb-stream-processor
S3_BUCKETNAME=$(aws cloudformation describe-stacks --stack-name $AWSDBWORKSHOP_CFSTACK_NAME | jq -r '.Stacks[].Outputs[] | select(.OutputKey=="S3bucketName") | .OutputValue')
echo $S3_BUCKETNAME
sam package --output-template-file template-out.yaml --s3-bucket $S3_BUCKETNAME
```

3. [_Optional_] Please take some time to review _template-out.yaml_.

4. Copy and paste the command below to ensure that the packages have been uploaded successfully to the Amazon S3 bucket.

```shell script
aws s3 ls s3://$S3_BUCKETNAME
```

> Sample Output:
> ```
> 2019-09-15 16:39:56      70451 14b63970e9437bf82ea16664d46a929e
> 2019-09-15 16:39:56      71954 d3eec91527b02d78de30ae42198cd0c0
> ```

5. Set the variables from the output of the AWS CloudFormation template that was deployed.

> Note: You can copy the entire script block below and paste it in the Cloud9 terminal window. Press Enter for the final command to execute.

```shell script
cd ~/environment/amazon-rds-purpose-built-workshop/src/ddb-stream-processor
AURORADB_NAME=$(aws cloudformation describe-stacks --stack-name $AWSDBWORKSHOP_CFSTACK_NAME | jq -r '.Stacks[].Outputs[] | select(.OutputKey=="AuroraDBName") | .OutputValue')
echo $AURORADB_NAME
AURORACLUSTERENDPOINT_NAME=$(aws cloudformation describe-stacks --stack-name $AWSDBWORKSHOP_CFSTACK_NAME | jq -r '.Stacks[].Outputs[] | select(.OutputKey=="AuroraClusterEndpointName") | .OutputValue')
echo $AURORACLUSTERENDPOINT_NAME
AURORADBMASTERUSER_NAME=$(aws cloudformation describe-stacks --stack-name $AWSDBWORKSHOP_CFSTACK_NAME | jq -r '.Stacks[].Outputs[] | select(.OutputKey=="AuroraDBMasterUser") | .OutputValue')
echo $AURORADBMASTERUSER_NAME
LAMBDASECURITYGROUP_ID=$(aws cloudformation describe-stacks --stack-name $AWSDBWORKSHOP_CFSTACK_NAME | jq -r '.Stacks[].Outputs[] | select(.OutputKey=="LambdaSecurityGroupId") | .OutputValue')
echo $LAMBDASECURITYGROUP_ID
LAMBDASUBNET1_ID=$(aws cloudformation describe-stacks --stack-name $AWSDBWORKSHOP_CFSTACK_NAME | jq -r '.Stacks[].Outputs[] | select(.OutputKey=="LambdaSubnet1") | .OutputValue')
LAMBDASUBNET2_ID=$(aws cloudformation describe-stacks --stack-name $AWSDBWORKSHOP_CFSTACK_NAME | jq -r '.Stacks[].Outputs[] | select(.OutputKey=="LambdaSubnet2") | .OutputValue')
echo $LAMBDASUBNET1_ID,$LAMBDASUBNET2_ID
```

6. Copy and paste the following command to deploy the Lambda function along with the Lambda Layer.

```shell script
sam deploy --template-file template-out.yaml --capabilities CAPABILITY_IAM --stack-name SAM-AWSDBWorkshop2019 --parameter-overrides LambdaLayerNameParameter=aws-db-workshop-pg8000-layer DDBStreamName=$AWSDBWORKSHOP_DDB_STREAM_NAME SecurityGroupIds=$LAMBDASECURITYGROUP_ID VpcSubnetIds=$LAMBDASUBNET1_ID,$LAMBDASUBNET2_ID DatabaseName=$AURORADB_NAME DatabaseHostName=$AURORACLUSTERENDPOINT_NAME DatabaseUserName=$AURORADBMASTERUSER_NAME DatabasePassword=auradmin123
```

>Note: This will take a few minutes. Ensure that the SAM template was successfully deployed. Look for the following line in the terminal output
>```
>Successfully created/updated stack - SAM-AWSDBWorkshop2019 in None
>```

7. Finally, deactivate the virtual environment.

```shell script
deactivate
```

Now you have successfully deployed the Lambda function.
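Optionally, you can also confirm from Python that the function is wired to the stream. This minimal sketch (assuming the default region configuration in Cloud9) looks up the table's stream ARN and lists the Lambda event source mappings attached to it; after a successful deploy you should see one mapping reach the `Enabled` state.

```python
import boto3

# Look up the ARN of the stream enabled earlier on the trips table.
ddb = boto3.client('dynamodb')
stream_arn = ddb.describe_table(TableName='aws-db-workshop-trips')['Table']['LatestStreamArn']

# A successful SAM deploy attaches the Lambda function to this stream.
lam = boto3.client('lambda')
for mapping in lam.list_event_source_mappings(EventSourceArn=stream_arn)['EventSourceMappings']:
    print(mapping['FunctionArn'], mapping['State'])
```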
## Taxi Ride Workflow

In this section, you will run Python scripts to simulate the booking of a taxi trip by a rider, followed by acceptance and completion of the trip by a driver. After the trip is complete, you will run back-end SQL queries to process billing and driver payments.

### Taxi Trip Booking Workflow

1. Copy and paste the following commands to book a new trip as a rider.

```shell script
cd ~/environment/amazon-rds-purpose-built-workshop/src/ddb-python-script/
python3 rider-book-trip.py
```

From the output of the script, make a note of the _tripinfo_ value. You will be entering this value when prompted by the subsequent scripts. It is a randomly generated string that uniquely identifies a trip. It will be similar to
>```
>"tripinfo": "2020-04-05T18:52:42Z,8987397"
>```

2. Copy and paste the following command to accept the trip as a driver. The script will prompt for the 'tripinfo' value. Enter the value (without double quotes, e.g. 2020-04-05T18:52:42Z,8987397) from the output of the previous Python script you just ran as a rider to book a new trip.

```shell script
python3 driver-accept-trip.py
```

3. Copy and paste the following command to complete the trip as a driver. The script will prompt for the 'tripinfo' value. Enter the same 'tripinfo' value (without double quotes, e.g. 2020-04-05T18:52:42Z,8987397) that you provided as input to the previous Python script you just ran as a driver to accept the trip.

```shell script
python3 driver-complete-trip.py
```

>Note: The trip status is now **_COMPLETE_**, so the Lambda function will have picked up the record from the DynamoDB stream and inserted it into Aurora PostgreSQL.
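For context, the scripts above read and write items in the _aws-db-workshop-trips_ table. The sketch below illustrates the general shape of that flow with Boto3; it is not the scripts' actual code, and the key names, attribute names, and status values in it are assumptions.

```python
import boto3

# Illustrative only -- the real logic lives in src/ddb-python-script.
table = boto3.resource('dynamodb').Table('aws-db-workshop-trips')

# Rider books a trip: write an item keyed by rider and tripinfo
# (assumed key names), with an initial status.
trip_key = {'riderid': 'person69257', 'tripinfo': '2020-04-05T18:52:42Z,8987397'}
table.put_item(Item={**trip_key, 'STATUS': 'Booked'})

# Driver completes the trip: update the status. Because the stream was
# enabled with NEW_AND_OLD_IMAGES, this update flows to the Lambda
# function, which copies the finished trip into Aurora PostgreSQL.
table.update_item(
    Key=trip_key,
    UpdateExpression='SET #s = :done',
    ExpressionAttributeNames={'#s': 'STATUS'},
    ExpressionAttributeValues={':done': 'COMPLETE'},
)
```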
### Driver Billing and Payments

1. Copy and paste the commands below to connect to the Aurora PostgreSQL database. Enter the password string for the Aurora database when prompted.

```shell script
AURORADB_NAME=$(aws cloudformation describe-stacks --stack-name $AWSDBWORKSHOP_CFSTACK_NAME | jq -r '.Stacks[].Outputs[] | select(.OutputKey=="AuroraDBName") | .OutputValue')
echo $AURORADB_NAME
AURORACLUSTERENDPOINT_NAME=$(aws cloudformation describe-stacks --stack-name $AWSDBWORKSHOP_CFSTACK_NAME | jq -r '.Stacks[].Outputs[] | select(.OutputKey=="AuroraClusterEndpointName") | .OutputValue')
echo $AURORACLUSTERENDPOINT_NAME
AURORADBMASTERUSER_NAME=$(aws cloudformation describe-stacks --stack-name $AWSDBWORKSHOP_CFSTACK_NAME | jq -r '.Stacks[].Outputs[] | select(.OutputKey=="AuroraDBMasterUser") | .OutputValue')
echo $AURORADBMASTERUSER_NAME
sudo psql -h $AURORACLUSTERENDPOINT_NAME -U $AURORADBMASTERUSER_NAME -d $AURORADB_NAME
```

2. Execute the query below to review the information in the _trips_ table for the trip that you just completed in the previous section.

```sql
select * from trips;
```

> The output of the query should have at least 1 row in the trips table, similar to the output below.
>```
> id      | rider_id | driver_id | rider_name  | rider_mobile | rider_email             | trip_info                           | driver_name  | driver_email          | driver_mobile | vehicle_id | cab_type_id | vendor_id | pickup_datetime     | dropoff_datetime    | store_and_fwd_flag | rate_code_id | pickup_longitude | pickup_latitude | dropoff_longitude | dropoff_latitude | passenger_count | trip_distance | fare_amount | extra | mta_tax | tip_amount | tolls_amount | ehail_fee | improvement_surcharge | total_amount | payment_type | trip_type | pickup_location_id | dropoff_location_id | status
>---------+----------+-----------+-------------+--------------+-------------------------+-------------------------------------+--------------+-----------------------+---------------+------------+-------------+-----------+---------------------+---------------------+--------------------+--------------+------------------+-----------------+-------------------+------------------+-----------------+---------------+-------------+-------+---------+------------+--------------+-----------+-----------------------+--------------+--------------+-----------+--------------------+---------------------+-----------
> 2000001 | 69257    | 528204    | person69257 | +11609467790 | person69257@example.com | 2019-12-18T05:15:33.640038Z,3219350 | driver528204 | driver528204@taxi.com | +11185992795  | PXX248130  | 2           | 2         | 2019-12-18 05:15:33 | 2019-12-18 05:19:10 | N                  | 4            | -73.496113       | 40.664146       | -73.527485        | 40.665024        | 3               | 32            | 142.96      | 0.3   | 0.4     | 4.92       | 4.4          | 0         | 0.3                   | 14.18        | 3            | 2         | 0                  | 0                   | Completed
> 2000002 | 69257    | 507977    | person69257 | +11609467790 | person69257@example.com | 2019-12-18T05:31:13.478619Z,1747531 | driver507977 | driver507977@taxi.com | +11088418780  | XVJ356159  | 2           | 2         | 2019-12-18 05:31:13 | 2019-12-18 05:31:57 | N                  | 3            | -73.401165       | 40.866392       | -73.065379        | 40.96106         | 4               | 8             | 55.39       | 1.0   | 0.4     | 8.57       | 2.25         | 0         | 0.8                   | 127.75       | 3            | 2         | 0                  | 0                   | Completed
>(2 rows)
>```

3. Execute the query below to insert driver billing information, for all drivers, for the current daily billing cycle based on the trip information in the _trips_ table.

```sql
insert into billing (driver_id, billing_cycle, billing_start, billing_end, billing_amount, commissions, rides_total, description, billing_status)
select driver_id, 2, current_date, current_date+1, sum(total_amount), 0.8, count(*), 'billing cycle 2', 'completed'
from trips
where dropoff_datetime < current_date+1 and dropoff_datetime > current_date
group by driver_id;
```

> The query should insert at least 1 row into the billing table, and the output should be similar to the following.
>```
> INSERT 0 1
>```

4. Execute the query below to review the billing information that you just inserted.

```sql
select * from billing where billing_cycle=2;
```

> The output of the query should retrieve at least 1 row from the billing table, similar to the output below.
>```
>   id   | driver_id | billing_cycle |    billing_start    |     billing_end     |        billing_date        | billing_amount | commissions |   description   | rides_total | billing_status
> --------+-----------+---------------+---------------------+---------------------+----------------------------+----------------+-------------+-----------------+-------------+----------------
> 200001 | 510909    | 2             | 2019-09-15 00:00:00 | 2019-09-16 00:00:00 | 2019-09-16 01:59:05.634323 | 42.26          | 0.8         | billing cycle 2 | 1           | completed
> (1 row)
>```

5. Execute the query below to insert driver payment information for all drivers for the current billing cycle, based on the billing information in the _billing_ table.

```sql
insert into payment (billing_id, driver_id, billing_cycle, payment_amount, payment_date, payment_id, payment_status, description)
select a.id, a.driver_id, a.billing_cycle, sum(a.billing_amount*a.commissions), a.billing_date, b.payment_id, 'completed', 'Payment cycle Jan 2020'
from billing a, drivers b
where a.driver_id=b.driver_id and a.billing_cycle = 2 and a.billing_status = 'completed'
group by a.id, a.driver_id, b.payment_id, a.billing_cycle, a.billing_date;
```

> The query should insert at least 1 row into the payment table, and the output should be similar to the output below.
>```
> INSERT 0 1
>```

6. Execute the query below to review the payment information that you just inserted.

```sql
select * from payment where description='Payment cycle Jan 2020';
```

> The output of the query should retrieve at least 1 row from the payment table, similar to the output below.
>```
>   id   | billing_id | driver_id | billing_cycle | payment_amount |        payment_date        | payment_id | payment_status |      description
> --------+------------+-----------+---------------+----------------+----------------------------+------------+----------------+------------------------
> 200001 | 200001     | 510909    | 2             | 33.808         | 2019-09-16 01:59:05.634323 | 7          | completed      | Payment cycle Jan 2020
> (1 row)
>```

7. Execute the following command to close the database connection.

```sql
\q
```
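As a quick sanity check on the sample numbers above: the payment amount is simply the billing amount multiplied by the commission rate.

```python
billing_amount = 42.26  # from the sample billing row
commissions = 0.8       # commission rate inserted in the billing step
print(billing_amount * commissions)  # 33.808, matching the sample payment row
```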
**Congrats!!** You have successfully completed Lab 2. Now you can proceed to [Lab 3](../lab3-AthenaFederatedQuery/).