{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Part 3: Training and deploying an Amazon Fraud Detector model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

💡 NOTE

\n", "In order to execute this notebook, you must first execute the previous notebook 2-afd-model-setup.ipynb.\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview \n", "\n", "* [Notebook 1: Data Preparation, Process, and Store Features](./1-data-analysis-prep.ipynb)\n", "* [Notebook 2: Amazon Fraud Detector Model Setup](./2-afd-model-setup.ipynb)\n", "* **[Notebook 3: Model training, deployment, real-time and batch inference](./3-afd-model-train-deploy.ipynb)**\n", " * **[Introduction](#intro)**\n", " * **[Notebook Setup](#setup)**\n", " * **[Set region, boto3 and SageMaker SDK variables](#setup2)**\n", " * **[Create & Train your Model](#train)**\n", " * **[Check model performance and activate](#activate)**\n", " * **[Create rules and detector](#detector)**\n", " * **[Make Predictions](#predictions)**\n", " * **[Conclusion](#conclusion)**\n", "* [Notebook 4: Create an end-to-end pipeline](./4-afd-pipeline.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1. Introduction \n", "___\n", "overview" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Amazon Fraud Detector Online Fraud Insights is a supervised machine learning model designed to detect a variety of online fraud and risks. Because the model is supervised, it requires historical examples of fraudulent and legitimate events to train the model.\n", "\n", "The Online Fraud Insights model uses an ensemble of machine-learning algorithms for data enrichment, transformation, and fraud classification. As part of the model training process, Online Fraud Insights enriches raw data elements like IP address and BIN number with 3rd party data such as the geo-location of the IP address or the issuing bank for a credit card. In addition to 3rd party data, Online Fraud Insights uses deep learning algorithms leveraging fraud patterns seen at Amazon and AWS. These fraud patterns become input features to your model using a gradient tree boosting algorithm. 
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As part of this notebook, we will create and train an Amazon Fraud Detector [Online Fraud Insights](https://docs.aws.amazon.com/frauddetector/latest/ug/online-fraud-insights.html) model. Once a model is created and trained, we can create a detector, which can be used to perform inference.\n", "\n", "A detector is a rules-based categorization engine that predicts predefined outcomes based on user configuration. For this tutorial, you define the model score thresholds as rules for the detector.\n", "\n", "Models can either be trained within Amazon Fraud Detector using custom user data, or they can be accessed from pre-created Amazon SageMaker endpoints.\n", "\n", "The high-level configuration flow is depicted in the following diagram.\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2. Setup \n", "----\n", "overview\n", "\n", "As part of setup, we will import the required libraries and retrieve the variables saved by the previous notebook so that we can use them here."
] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# -- display helpers (IPython.core.display is deprecated in favor of IPython.display)\n", "from IPython.display import display, HTML, clear_output, JSON\n", "\n", "display(HTML(\"\"))\n", "# ------------------------------------------------------------------\n", "\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import pandas as pd\n", "import os\n", "import sys\n", "import time\n", "import json\n", "import uuid\n", "from datetime import datetime\n", "import boto3\n", "import sagemaker\n", "\n", "pd.set_option('display.max_rows', 500)\n", "pd.set_option('display.max_columns', 500)\n", "pd.set_option('display.width', 1000)\n", "\n", "# -- sklearn --\n", "from sklearn.metrics import roc_curve, roc_auc_score, auc\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The previous notebook [2-afd-model-setup.ipynb](./2-afd-model-setup.ipynb) defines a few variables and saves them into the Jupyter cache using the `%store` magic command. We will retrieve those variables for use in this notebook. " ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

Retrieved values:

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/json": { "labelSchema": { "labelMapper": { "FRAUD": [ "fraud" ], "LEGIT": [ "legit" ] } }, "modelVariables": [ "ip_address", "email_address", "user_agent", "customer_name", "phone_number", "customer_city", "customer_postal", "customer_state", "customer_address" ] }, "text/plain": [ "" ] }, "metadata": { "application/json": { "expanded": false, "root": "root" } }, "output_type": "display_data" } ], "source": [ "# Retrieve cached variables\n", "\n", "%store -r ENTITY_TYPE\n", "%store -r ENTITY_DESC\n", "%store -r EVENT_TYPE\n", "%store -r EVENT_DESC\n", "%store -r MODEL_NAME\n", "%store -r MODEL_DESC\n", "%store -r DETECTOR_NAME\n", "%store -r DETECTOR_DESC\n", "%store -r S3_FILE_LOC\n", "%store -r trainingDataSchema\n", "%store -r eventVariables\n", "\n", "display(HTML(f\"

Retrieved values:

  • {ENTITY_TYPE}
  • {ENTITY_DESC}
  • {EVENT_TYPE}
  • {EVENT_DESC}
  • {MODEL_NAME}
  • {MODEL_DESC}
  • {DETECTOR_NAME}
  • {DETECTOR_DESC}
  • {S3_FILE_LOC}

\"))\n", "display(JSON(trainingDataSchema))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3. Set region, boto3 and SageMaker SDK variables \n", "---\n", "overview" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will initialize a Fraud Detector, S3 and Sagemaker Boto3 client objects." ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Using AWS Region: us-east-2\n" ] } ], "source": [ "#You can change this to a region of your choice\n", "region = sagemaker.Session().boto_region_name\n", "print(\"Using AWS Region: {}\".format(region))" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "boto3.setup_default_session(region_name=region)\n", "\n", "boto_session = boto3.Session(region_name=region)\n", "\n", "# -- initialize S3 Client\n", "s3_client = boto3.client('s3', region_name=region)\n", "\n", "# -- initialize the AFD client \n", "client = boto3.client('frauddetector')\n", "\n", "sagemaker_boto_client = boto_session.client('sagemaker')\n", "\n", "sagemaker_session = sagemaker.session.Session(\n", " boto_session=boto_session,\n", " sagemaker_client=sagemaker_boto_client)\n", "\n", "# -- suffix is appended to detector and model name for uniqueness \n", "sufx = datetime.now().strftime(\"%Y%m%d\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will get the SageMaker Execution Role " ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "SageMaker Role: AmazonSageMaker-ExecutionRole-20201030T135016\n" ] } ], "source": [ "print('SageMaker Role:', sagemaker.get_execution_role().split('/')[-1])\n", "\n", "ARN_ROLE = sagemaker.get_execution_role()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4. 
Create & Train your Model \n", "-----\n", "overview\n", "\n", "An Amazon Fraud Detector model version is created by calling the [CreateModel](https://docs.aws.amazon.com/frauddetector/latest/api/API_CreateModel.html) and [CreateModelVersion](https://docs.aws.amazon.com/frauddetector/latest/api/API_CreateModelVersion.html) operations. \n", "\n", "* CreateModel initiates the model, which acts as a container for your model versions. \n", "* CreateModelVersion starts the training process, which results in a specific version of the model. A new model version is created each time you call CreateModelVersion.\n", "\n", "As stated earlier, we will initialize a model of `modelType = ONLINE_FRAUD_INSIGHTS`.\n" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

Initialize Model

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/json": { "ResponseMetadata": { "HTTPHeaders": { "connection": "keep-alive", "content-length": "2", "content-type": "application/x-amz-json-1.1", "date": "Thu, 13 May 2021 16:02:42 GMT", "x-amzn-requestid": "fbd96642-3943-424b-a836-fe17784c0255" }, "HTTPStatusCode": 200, "RequestId": "fbd96642-3943-424b-a836-fe17784c0255", "RetryAttempts": 0 } }, "text/plain": [ "" ] }, "metadata": { "application/json": { "expanded": false, "root": "root" } }, "output_type": "display_data" } ], "source": [ "# -- create AFD model --\n", "response = client.create_model(\n", " description = MODEL_DESC,\n", " eventTypeName = EVENT_TYPE,\n", " modelId = MODEL_NAME,\n", " modelType = 'ONLINE_FRAUD_INSIGHTS')\n", "\n", "# Display output\n", "display(HTML(\"

Initialize Model

\"))\n", "display(JSON(response))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we will train our first version using the `CreateModelVersion` API. `TrainingDataSource` and `ExternalEventsDetail` specify the source and Amazon S3 location of the training data set. `TrainingDataSchema` specifies how Amazon Fraud Detector should interpret the training data, specifically which event variables to include and how to classify the event labels.\n", "\n", "

💡 NOTE

\n", "Running the code cell below will kick off the training process asynchronously. Training can take anywhere from 30 to 60 minutes. At any point during training, you can cancel it by calling UpdateModelVersionStatus and updating the status to TRAINING_CANCELLED.\n", "
\n", "\n", "We will also need to provide an IAM Role ARN to the create_model_version call. This is different than the IAM Role that we have been using so far. The `dataAccessRoleArn` is an IAM role that should be assumable by Amazon Fraud Detector, so you will need to create a \"New\" IAM Role and assign it S3 read/write policy with a trust relationship with Amazon Fraud Detector. Here's what the IAM Policy looks like [**Note**: replace the bucket name appropriately]\n", "\n", "```json\n", "{\n", " \"Version\": \"2012-10-17\",\n", " \"Statement\": [\n", " {\n", " \"Action\": [\n", " \"s3:ListBucket\",\n", " \"s3:GetBucketLocation\"\n", " ],\n", " \"Effect\": \"Allow\",\n", " \"Resource\": [\n", " \"arn:aws:s3:::YOUR_BUCKET_NAME\"\n", " ]\n", " },\n", " {\n", " \"Action\": [\n", " \"s3:GetObject\"\n", " ],\n", " \"Effect\": \"Allow\",\n", " \"Resource\": [\n", " \"arn:aws:s3:::YOUR_BUCKET_NAME/*\"\n", " ]\n", " }\n", " ]\n", "}\n", "```" ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Stored 'DATA_ACCESS_ROLE_ARN' (str)\n" ] } ], "source": [ "# Replace with the Role ARN\n", "\n", "DATA_ACCESS_ROLE_ARN=\"arn:aws:iam::965425568475:role/service-role/AmazonFraudDetector-DataAccessRole-1620923454652\"\n", "%store DATA_ACCESS_ROLE_ARN\n" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

Model training

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/json": { "ResponseMetadata": { "HTTPHeaders": { "connection": "keep-alive", "content-length": "132", "content-type": "application/x-amz-json-1.1", "date": "Thu, 13 May 2021 19:33:30 GMT", "x-amzn-requestid": "3dba7bcc-ea38-4371-b979-710fe3d0063d" }, "HTTPStatusCode": 200, "RequestId": "3dba7bcc-ea38-4371-b979-710fe3d0063d", "RetryAttempts": 0 }, "modelId": "afd_demo_model_20210512", "modelType": "ONLINE_FRAUD_INSIGHTS", "modelVersionNumber": "1.0", "status": "TRAINING_IN_PROGRESS" }, "text/plain": [ "" ] }, "metadata": { "application/json": { "expanded": false, "root": "root" } }, "output_type": "display_data" } ], "source": [ "\n", "response = client.create_model_version(\n", " modelId = MODEL_NAME,\n", " modelType = 'ONLINE_FRAUD_INSIGHTS',\n", " trainingDataSource = 'EXTERNAL_EVENTS',\n", " trainingDataSchema = trainingDataSchema,\n", " externalEventsDetail = {\n", " 'dataLocation' : S3_FILE_LOC,\n", " 'dataAccessRoleArn': DATA_ACCESS_ROLE_ARN\n", " }\n", ")\n", "\n", "# Display output\n", "display(HTML(\"

Model training

\"))\n", "display(JSON(response))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can check the progress of your model training by running the code cell below. Alternatively, you may also check the status of the training process via the Amazon Fraud Detector Console." ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2021-05-13 22:21:04.317898: Model status : TRAINING_COMPLETE\n", "\n", "Model training complete\n", "\n", "Elapsed time : 0.24937129020690918 seconds \n", "\n" ] }, { "data": { "application/json": { "ResponseMetadata": { "HTTPHeaders": { "connection": "keep-alive", "content-length": "791", "content-type": "application/x-amz-json-1.1", "date": "Thu, 13 May 2021 22:21:04 GMT", "x-amzn-requestid": "43dcf3c9-fc4b-42f9-8335-966ac1287339" }, "HTTPStatusCode": 200, "RequestId": "43dcf3c9-fc4b-42f9-8335-966ac1287339", "RetryAttempts": 0 }, "arn": "arn:aws:frauddetector:us-east-2:965425568475:model-version/ONLINE_FRAUD_INSIGHTS/afd_demo_model_20210512/1.0", "externalEventsDetail": { "dataAccessRoleArn": "arn:aws:iam::965425568475:role/service-role/AmazonFraudDetector-DataAccessRole-1620923454652", "dataLocation": "s3://sagemaker-us-east-2-965425568475/amazon-fraud-detector/training_data/afd_training_data.csv" }, "modelId": "afd_demo_model_20210512", "modelType": "ONLINE_FRAUD_INSIGHTS", "modelVersionNumber": "1.0", "status": "TRAINING_COMPLETE", "trainingDataSchema": { "labelSchema": { "labelMapper": { "FRAUD": [ "fraud" ], "LEGIT": [ "legit" ] } }, "modelVariables": [ "ip_address", "email_address", "user_agent", "customer_name", "phone_number", "customer_city", "customer_postal", "customer_state", "customer_address" ] }, "trainingDataSource": "EXTERNAL_EVENTS" }, "text/plain": [ "" ] }, "metadata": { "application/json": { "expanded": false, "root": "root" } }, "output_type": "display_data" } ], "source": [ "# Optional code cell\n", "from datetime import 
datetime\n", "\n", "print(\"Wait for model training to complete...\")\n", "stime = time.time()\n", "while True:\n", "    current_time = datetime.now()\n", "    clear_output(wait=True)\n", "    response = client.get_model_version(modelId=MODEL_NAME, modelType=\"ONLINE_FRAUD_INSIGHTS\", modelVersionNumber='1.0')\n", "    # -- stop polling as soon as training leaves the in-progress state\n", "    if response['status'] != 'TRAINING_IN_PROGRESS':\n", "        print(f\"{current_time}: Model status : {response['status']}\")\n", "        break\n", "    print(f\"{current_time}: current progress: {(time.time() - stime)/60:3.3} minutes\")\n", "    time.sleep(60)  # -- sleep for 60 seconds\n", "\n", "etime = time.time()\n", "\n", "# -- summarize -- \n", "print(\"\\nModel training complete\")\n", "print(\"\\nElapsed time : %s\" % (etime - stime) + \" seconds \\n\" )\n", "display(JSON(response))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 5. Check model performance and activate \n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once training is complete, the model version status will update to `TRAINING_COMPLETE`. You can review model performance using the Amazon Fraud Detector console or by calling the [DescribeModelVersions](https://docs.aws.amazon.com/frauddetector/latest/api/API_DescribeModelVersions.html) API." ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "\n", "auc = client.describe_model_versions(\n", " modelId= MODEL_NAME,\n", " modelVersionNumber='1.0',\n", " modelType='ONLINE_FRAUD_INSIGHTS',\n", " maxResults=10\n", ")['modelVersionDetails'][0]['trainingResult']['trainingMetrics']['auc']\n", "\n", "\n", "df_model = pd.DataFrame(client.describe_model_versions(\n", " modelId= MODEL_NAME,\n", " modelVersionNumber='1.0',\n", " modelType='ONLINE_FRAUD_INSIGHTS',\n", " maxResults=10\n", ")['modelVersionDetails'][0]['trainingResult']['trainingMetrics']['metricDataPoints'])\n", "\n", "\n", "plt.figure(figsize=(10,10))\n", "plt.plot(df_model[\"fpr\"], df_model[\"tpr\"], color='darkorange',\n", " lw=2, label='ROC curve (area = %0.3f)' % auc)\n", "plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')\n", "plt.xlabel('False Positive Rate')\n", "plt.ylabel('True Positive Rate')\n", "plt.title( MODEL_NAME + ' ROC Chart')\n", "plt.legend(loc=\"lower right\",fontsize=12)\n", "plt.axvline(x = 0.02 ,linewidth=2, color='r')\n", "plt.axhline(y = 0.73 ,linewidth=2, color='r')\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fprprecisiontprthreshold
01.0000000.0504711.0000000.0
10.8852640.0563070.9937215.0
20.8135010.0606990.98901110.0
30.7607640.0645390.98744115.0
40.7190420.0680290.98744120.0
\n", "
" ], "text/plain": [ " fpr precision tpr threshold\n", "0 1.000000 0.050471 1.000000 0.0\n", "1 0.885264 0.056307 0.993721 5.0\n", "2 0.813501 0.060699 0.989011 10.0\n", "3 0.760764 0.064539 0.987441 15.0\n", "4 0.719042 0.068029 0.987441 20.0" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_model.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Model Scores\n", "\n", "Amazon Fraud Detector generates model scores between 0 and 1000, where 0 is low fraud risk and 1000 is high fraud risk. Model scores are directly related to the **false positive rate (FPR)**. For example, a score of 600 corresponds to an estimated 10% false positive rate, whereas a score of 900 corresponds to an estimated 2% false positive rate. The following table provides details of how certain model scores correlate to estimated false positive rates.\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After reviewing the model performance, activate the model to make it available for use by detectors in real-time fraud predictions. Amazon Fraud Detector will deploy the model in multiple Availability Zones for redundancy, with auto-scaling turned on so that the model scales with the number of fraud predictions you are making. To activate the model, call the UpdateModelVersionStatus API and update the status to `ACTIVE`."
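, "\n", "\n", "
To make the relationship between model scores and false positive rates concrete, here is a minimal sketch, separate from the Amazon Fraud Detector API, that picks the lowest score threshold whose estimated FPR stays within a target. The function name `threshold_for_fpr` is hypothetical, and the data points are the rounded values from the rule table that appears later in this notebook.

```python
# Illustrative helper, not part of the Amazon Fraud Detector API.
# Each point mirrors one row of the metric data points: (fpr, tpr, threshold).
POINTS = [
    (0.01, 0.72, 970.0),
    (0.02, 0.85, 920.0),
    (0.03, 0.89, 880.0),
    (0.04, 0.92, 835.0),
    (0.05, 0.92, 795.0),
    (0.06, 0.93, 750.0),
]


def threshold_for_fpr(points, max_fpr):
    '''Return the lowest threshold whose estimated FPR is at most max_fpr.

    A lower threshold flags more events, so it catches more fraud (higher TPR)
    at the cost of more false positives. Returns None if no point qualifies.
    '''
    eligible = [threshold for fpr, _tpr, threshold in points if fpr <= max_fpr]
    return min(eligible) if eligible else None


print(threshold_for_fpr(POINTS, 0.02))  # 920.0
```

With a 2% FPR budget the helper settles on a score of 920, which is where the true positive rate in these points is 0.85. A stricter budget pushes the threshold up and the TPR down.
"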
] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2021-05-14 04:13:34.844916Model status : ACTIVE\n", "\n", "Elapsed time : 602.9988882541656 seconds \n", "\n" ] }, { "data": { "application/json": { "ResponseMetadata": { "HTTPHeaders": { "connection": "keep-alive", "content-length": "780", "content-type": "application/x-amz-json-1.1", "date": "Fri, 14 May 2021 04:13:35 GMT", "x-amzn-requestid": "7b6a35b3-7347-473c-9b4f-1e41489d5966" }, "HTTPStatusCode": 200, "RequestId": "7b6a35b3-7347-473c-9b4f-1e41489d5966", "RetryAttempts": 0 }, "arn": "arn:aws:frauddetector:us-east-2:965425568475:model-version/ONLINE_FRAUD_INSIGHTS/afd_demo_model_20210512/1.0", "externalEventsDetail": { "dataAccessRoleArn": "arn:aws:iam::965425568475:role/service-role/AmazonFraudDetector-DataAccessRole-1620923454652", "dataLocation": "s3://sagemaker-us-east-2-965425568475/amazon-fraud-detector/training_data/afd_training_data.csv" }, "modelId": "afd_demo_model_20210512", "modelType": "ONLINE_FRAUD_INSIGHTS", "modelVersionNumber": "1.0", "status": "ACTIVE", "trainingDataSchema": { "labelSchema": { "labelMapper": { "FRAUD": [ "fraud" ], "LEGIT": [ "legit" ] } }, "modelVariables": [ "ip_address", "email_address", "user_agent", "customer_name", "phone_number", "customer_city", "customer_postal", "customer_state", "customer_address" ] }, "trainingDataSource": "EXTERNAL_EVENTS" }, "text/plain": [ "" ] }, "metadata": { "application/json": { "expanded": false, "root": "root" } }, "output_type": "display_data" } ], "source": [ "from datetime import datetime\n", "\n", "response = client.update_model_version_status (\n", " modelId = MODEL_NAME,\n", " modelType = 'ONLINE_FRAUD_INSIGHTS',\n", " modelVersionNumber = '1.0',\n", " status = 'ACTIVE'\n", ")\n", "print(\"Activating model...\")\n", "print(response)\n", "\n", "#-- wait until model is active \n", "print(\"Waiting until model status is active \")\n", "stime = 
time.time()\n", "while True:\n", "    current_time = datetime.now()\n", "    clear_output(wait=True)\n", "    response = client.get_model_version(modelId=MODEL_NAME, modelType=\"ONLINE_FRAUD_INSIGHTS\", modelVersionNumber='1.0')\n", "    # -- stop polling once the model version reports ACTIVE\n", "    if response['status'] == 'ACTIVE':\n", "        print(f\"{current_time}: Model status : {response['status']}\")\n", "        break\n", "    print(f\"{current_time}: current progress: {(time.time() - stime)/60:3.3} minutes\")\n", "    time.sleep(60)  # sleep for 1 minute\n", "\n", "etime = time.time()\n", "print(\"\\nElapsed time : %s\" % (etime - stime) + \" seconds \\n\" )\n", "display(JSON(response))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 6. Create rules and detector \n", "-----\n", "overview" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A detector contains the detection logic, such as the models and rules, for a particular event that you want to evaluate for fraud. Each detector can evaluate one event type. A detector can have multiple versions, with each version having a status of `DRAFT`, `ACTIVE`, or `INACTIVE`. Only one detector version can be in ACTIVE status at a time.\n", "\n", "A detector acts as a container for your detector versions. The [PutDetector](https://docs.aws.amazon.com/frauddetector/latest/api/API_PutDetector.html) API specifies what event type the detector will evaluate."
] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "data": { "application/json": { "ResponseMetadata": { "HTTPHeaders": { "connection": "keep-alive", "content-length": "2", "content-type": "application/x-amz-json-1.1", "date": "Fri, 14 May 2021 04:13:46 GMT", "x-amzn-requestid": "9df3c550-7506-48c4-b713-49fa746cbe13" }, "HTTPStatusCode": 200, "RequestId": "9df3c550-7506-48c4-b713-49fa746cbe13", "RetryAttempts": 0 } }, "text/plain": [ "" ] }, "metadata": { "application/json": { "expanded": false, "root": "root" } }, "output_type": "display_data" } ], "source": [ "\n", "response = client.put_detector(detectorId = DETECTOR_NAME, \n", " description = DETECTOR_DESC, \n", " eventTypeName = EVENT_TYPE )\n", "\n", "display(JSON(response))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before we can create a detector version, we will need to create a set of rules. A rule is a condition that tells Amazon Fraud Detector how to interpret variable values during a fraud prediction. A rule consists of one or more variables, a logic expression, and one or more outcomes. A detector must have at least one associated rule. Rules in a detector are evaluated as part of a fraud prediction. We will start by defining outcomes first." 
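, "\n", "\n", "
As a rough illustration of the three parts of a rule named above (variable, logic expression, outcome), the following sketch assembles them into a plain dictionary. The helper `build_score_rule`, the rule ID, and the model name `sample_model` are hypothetical placeholders. Only the `$<model>_insightscore` variable form is taken from this notebook.

```python
# Illustrative only: this dict mimics the pieces of a rule described above.
# The model name and threshold are placeholders, not values from a trained model.

def build_score_rule(rule_id, model_name, threshold, outcome):
    '''Describe one rule: a model score variable, a comparison, and an outcome.'''
    variable = '$' + model_name + '_insightscore'
    return {
        'ruleId': rule_id,
        'expression': variable + ' > ' + str(threshold),
        'outcomes': [outcome],
    }


rule = build_score_rule('high_risk', 'sample_model', 900, 'review')
print(rule['expression'])  # $sample_model_insightscore > 900
```

The outcome name has to exist before a rule can refer to it, which is why the outcomes are defined first.
"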
] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [], "source": [ "outcome_list = [\n", "    {\n", "        \"name\": 'verify_customer',\n", "        \"desc\": 'this outcome initiates a verification workflow'\n", "    },\n", "    {\n", "        \"name\": 'review',\n", "        \"desc\": 'this outcome sidelines event for human or automated review'\n", "    },\n", "    {\n", "        \"name\": 'approve',\n", "        \"desc\": 'this outcome approves the event'\n", "    }\n", "]" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Outcome verify_customer already exists ...\n", "Outcome review already exists ...\n", "Outcome approve already exists ...\n" ] } ], "source": [ "# Generate outcomes\n", "for outcome in outcome_list:\n", "    try:\n", "        client.get_outcomes(name = outcome['name'])\n", "        print(f\"Outcome {outcome['name']} already exists ...\")\n", "    except client.exceptions.ResourceNotFoundException:\n", "        # -- get_outcomes raises this error when the outcome has not been created yet\n", "        print(f\"Creating outcome: {outcome['name']} ...\")\n", "        client.put_outcome(name = outcome['name'],\n", "                           description = outcome['desc'])\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have defined three possible outcomes (`verify_customer`, `review`, and `approve`), we will define the corresponding rules: each rule gets a logic expression and is mapped to one of the defined outcomes. The following code cell generates rules from the model's false positive rate (FPR) data points. It considers only FPR values between 1% and 6%, builds a rule from the score threshold at each value, and maps a corresponding outcome to that rule. We will use the output of this code cell to:\n", "\n", "1. Create the individual rules using the [CreateRule](https://docs.aws.amazon.com/frauddetector/latest/api/API_CreateRule.html) API\n", "2. 
Generate a list of rules to be used by the detector version" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Score thresholds 1% to 6% ...\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
fprtprthresholdruleoutcome
00.010.72970.0$afd_demo_model_20210512_insightscore > 970.0review
10.020.85920.0$afd_demo_model_20210512_insightscore > 920.0review
20.030.89880.0$afd_demo_model_20210512_insightscore > 880.0review
30.040.92835.0$afd_demo_model_20210512_insightscore > 835.0verify_customer
40.050.92795.0$afd_demo_model_20210512_insightscore > 795.0verify_customer
50.060.93750.0$afd_demo_model_20210512_insightscore <= 795.0approve
\n", "
" ], "text/plain": [ " fpr tpr threshold rule outcome\n", "0 0.01 0.72 970.0 $afd_demo_model_20210512_insightscore > 970.0 review\n", "1 0.02 0.85 920.0 $afd_demo_model_20210512_insightscore > 920.0 review\n", "2 0.03 0.89 880.0 $afd_demo_model_20210512_insightscore > 880.0 review\n", "3 0.04 0.92 835.0 $afd_demo_model_20210512_insightscore > 835.0 verify_customer\n", "4 0.05 0.92 795.0 $afd_demo_model_20210512_insightscore > 795.0 verify_customer\n", "5 0.06 0.93 750.0 $afd_demo_model_20210512_insightscore <= 795.0 approve" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "model_stat = df_model.round(decimals=2) \n", "\n", "m = model_stat.loc[model_stat.groupby([\"fpr\"])[\"threshold\"].idxmax()] \n", "\n", "def make_rule(x):\n", " rule = \"\"\n", " if x['fpr'] <= 0.05: \n", " rule = f\"${MODEL_NAME}_insightscore > {x['threshold']}\"\n", " if x['fpr'] == 0.06:\n", " rule = f\"${MODEL_NAME}_insightscore <= {x['threshold_prev']}\"\n", " return rule\n", " \n", "m[\"threshold_prev\"] = m['threshold'].shift(1)\n", "m['rule'] = m.apply(lambda x: make_rule(x), axis=1)\n", "\n", "m['outcome'] = \"approve\"\n", "m.loc[m['fpr'] <= 0.03, \"outcome\"] = \"review\"\n", "m.loc[(m['fpr'] > 0.03) & (m['fpr'] <= 0.05), \"outcome\"] = \"verify_customer\"\n", "\n", "print (\"Score thresholds 1% to 6% ...\")\n", "display(m[[\"fpr\", \"tpr\", \"threshold\", \"rule\", \"outcome\"]].loc[(m['fpr'] > 0.0 ) & (m['fpr'] <= 0.06)].reset_index(drop=True))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create rules using [CreateRule](https://docs.aws.amazon.com/frauddetector/latest/api/API_CreateRule.html) API and generate rule list" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import botocore\n", "\n", "rule_set = m[(m[\"fpr\"] > 0.0) & (m[\"fpr\"] <= 0.06)][[\"outcome\", \"rule\"]].to_dict('records')\n", "rule_list = []\n", "for i, rule in enumerate(rule_set):\n", " ruleId = \"rule{0}_{1}\".format(i, 
MODEL_NAME)\n", " rule_list.append({\"ruleId\": ruleId,\n", "                   \"ruleVersion\": '1',\n", "                   \"detectorId\": DETECTOR_NAME})\n", "\n", " try:\n", "     response = client.create_rule(\n", "         ruleId = ruleId,\n", "         detectorId = DETECTOR_NAME,\n", "         expression = rule['rule'],\n", "         language = 'DETECTORPL',\n", "         outcomes = [rule['outcome']]\n", "     )\n", "     print(f\"Created rule: {ruleId}: IF {rule['rule']} THEN {rule['outcome']}\")\n", " except botocore.exceptions.ClientError as error:\n", "     # -- raised when the rule already exists, but also for other service errors\n", "     print(f\"Could not create rule {ruleId}: {error.response['Error']['Message']}\")\n", " except Exception as e:\n", "     print(e)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have created our rules, we will create a detector version using the detector we created earlier. A detector version defines the specific models and rules that will be run as part of the [GetEventPrediction](https://docs.aws.amazon.com/frauddetector/latest/api/API_GetEventPrediction.html) request. You can add any of the rules defined within a detector to the detector version. You can also add any model trained on the evaluated event type. Each detector version has a status of `DRAFT`, `ACTIVE`, or `INACTIVE`. Only one detector version can be in ACTIVE status at a time. During the GetEventPrediction request, Amazon Fraud Detector will use the `ACTIVE` detector version if no `DetectorVersion` is specified.\n", "\n", "Amazon Fraud Detector supports two different rule execution modes: FIRST_MATCHED and ALL_MATCHED.\n", "\n", "* If the rule execution mode is `FIRST_MATCHED`, Amazon Fraud Detector evaluates rules sequentially, first to last, stopping at the first matched rule. Amazon Fraud Detector then provides the outcomes for that single rule. 
If a rule evaluates to false (not matched), the next rule in the list is evaluated.\n", "* If the rule execution mode is `ALL_MATCHED`, then all rules in an evaluation are executed in parallel, regardless of their order. Amazon Fraud Detector executes all rules and returns the defined outcomes for every matched rule." ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Detector version created ... \n" ] }, { "data": { "application/json": { "ResponseMetadata": { "HTTPHeaders": { "connection": "keep-alive", "content-length": "79", "content-type": "application/x-amz-json-1.1", "date": "Tue, 18 May 2021 22:45:32 GMT", "x-amzn-requestid": "b78e492d-77cb-4ea9-a29f-4dd06a2cf7f2" }, "HTTPStatusCode": 200, "RequestId": "b78e492d-77cb-4ea9-a29f-4dd06a2cf7f2", "RetryAttempts": 0 }, "detectorId": "afd_detector_20210512", "detectorVersionId": "2", "status": "DRAFT" }, "text/plain": [ "" ] }, "metadata": { "application/json": { "expanded": false, "root": "root" } }, "output_type": "display_data" } ], "source": [ "\n", "response = client.create_detector_version(detectorId = DETECTOR_NAME,\n", " rules = rule_list,\n", " modelVersions = [\n", " {\n", " \"modelId\":MODEL_NAME, \n", " \"modelType\" : \"ONLINE_FRAUD_INSIGHTS\",\n", " \"modelVersionNumber\" : \"1.0\"\n", " }\n", " ],\n", " ruleExecutionMode = 'FIRST_MATCHED'\n", " )\n", "print(\"Detector version created ... \")\n", "display(JSON(response))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To update the status of a detector version, use the [UpdateDetectorVersionStatus](https://docs.aws.amazon.com/frauddetector/latest/api/API_UpdateDetectorVersionStatus.html) API." ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Detector activated ... 
\n" ] }, { "data": { "application/json": { "ResponseMetadata": { "HTTPHeaders": { "connection": "keep-alive", "content-length": "2", "content-type": "application/x-amz-json-1.1", "date": "Tue, 18 May 2021 22:58:39 GMT", "x-amzn-requestid": "14caafa4-92fc-422c-9554-b2bcb93f1d6f" }, "HTTPStatusCode": 200, "RequestId": "14caafa4-92fc-422c-9554-b2bcb93f1d6f", "RetryAttempts": 0 } }, "text/plain": [ "" ] }, "metadata": { "application/json": { "expanded": false, "root": "root" } }, "output_type": "display_data" } ], "source": [ "\n", "response = client.update_detector_version_status(detectorId= DETECTOR_NAME,\n", " detectorVersionId='1',\n", " status='ACTIVE'\n", " )\n", "print(\"Detector activated ... \")\n", "display(JSON(response))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 7. Make Predictions \n", "-----\n", "overview\n", "\n", "Our Amazon Fraud Detector model is now ready to make predictions. There are two ways to run inference with an Amazon Fraud Detector model:\n", "\n", "1. Real-time prediction\n", "2. Batch prediction\n", "\n", "The following section applies your detector to the first 10 records in your training dataset in real-time prediction mode. Real-time prediction suits applications that need to evaluate fraud risk as an event happens, for example a web or mobile app sign-up page. A common architecture for a real-time fraud detection service calls the `GetEventPrediction` API from a [Lambda function](https://docs.aws.amazon.com/lambda/latest/dg/welcome.html) that receives fraud detection requests from a web or mobile app through an [API Gateway](https://docs.aws.amazon.com/apigateway/latest/developerguide/welcome.html) endpoint.\n", "\n", "#### 7.1 Real-time predictions\n", "---\n", "The code below loops through the first 10 records in our training data and runs a prediction on each one."
] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [], "source": [ "# Apply the detector to the first 10 records of the training dataset.\n", "record_count = 10\n", "predicted_dat = []\n", "# The trailing 'Z' in the format string marks the timestamp as UTC, so take the current time in UTC.\n", "dateTimeObj = datetime.utcnow()\n", "timestampStr = dateTimeObj.strftime(\"%Y-%m-%dT%H:%M:%SZ\")\n", "pred_data = df[eventVariables].head(record_count).astype(str).to_dict(orient='records')\n", "\n", "for rec in pred_data:\n", "    eventId = uuid.uuid1()  # unique ID for this prediction event\n", "    pred = client.get_event_prediction(detectorId=DETECTOR_NAME,\n", "                                       detectorVersionId='1',\n", "                                       eventId=str(eventId),\n", "                                       eventTypeName=EVENT_TYPE,\n", "                                       eventTimestamp=timestampStr,\n", "                                       entities=[\n", "                                           {\n", "                                               'entityType': ENTITY_TYPE,\n", "                                               'entityId': str(eventId.int)\n", "                                           }\n", "                                       ],\n", "                                       eventVariables=rec)\n", "\n", "    # Attach the model score and the outcomes of the first rule result to the record.\n", "    rec[\"score\"] = pred['modelScores'][0]['scores'][f\"{MODEL_NAME}_insightscore\"]\n", "    rec[\"outcome\"] = pred['ruleResults'][0]['outcomes']\n", "    predicted_dat.append(rec)" ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n",
"<tr><th></th><th>ip_address</th><th>email_address</th><th>user_agent</th><th>customer_name</th><th>phone_number</th><th>customer_city</th><th>customer_postal</th><th>customer_state</th><th>customer_address</th><th>score</th><th>outcome</th></tr>\n",
"</thead>\n", "<tbody>\n",
"<tr><th>0</th><td>168.49.13.67</td><td>synth_emily31@gmail.com</td><td>Opera/8.80.(Windows NT 5.0; ia-FR) Presto/2.9....</td><td>Tabitha Rodgers</td><td>(555)416 - 4931</td><td>West Brenda</td><td>33426.0</td><td>WA</td><td>8252 Garcia Pass Suite 148</td><td>38.0</td><td>[approve]</td></tr>\n",
"<tr><th>1</th><td>212.201.60.16</td><td>synth_kjohnson@hotmail.com</td><td>Mozilla/5.0 (iPad; CPU iPad OS 9_3_5 like Mac ...</td><td>Jeremy Scott</td><td>(555)514 - 7683</td><td>New Sethberg</td><td>32170.0</td><td>ND</td><td>201 Bowen Harbor</td><td>43.0</td><td>[approve]</td></tr>\n",
"<tr><th>2</th><td>116.14.61.213</td><td>synth_swansonchristopher@gmail.com</td><td>Mozilla/5.0 (Android 4.0; Mobile; rv:27.0) Gec...</td><td>Whitney Cabrera</td><td>(555)858 - 6712</td><td>Jacksonborough</td><td>32994.0</td><td>CO</td><td>524 Burke Inlet</td><td>281.0</td><td>[approve]</td></tr>\n",
"<tr><th>3</th><td>192.89.123.199</td><td>synth_ufox@yahoo.com</td><td>Mozilla/5.0 (iPad; CPU iPad OS 5_1_1 like Mac ...</td><td>Timothy Campos</td><td>(555)349 - 6656</td><td>Christophershire</td><td>34828.0</td><td>AK</td><td>41584 Kelli Mount</td><td>26.0</td><td>[approve]</td></tr>\n",
"<tr><th>4</th><td>192.88.215.76</td><td>synth_sbrewer@yahoo.com</td><td>Opera/9.75.(X11; Linux i686; iu-CA) Presto/2.9...</td><td>Justin Mathews</td><td>(555)519 - 3973</td><td>Richardside</td><td>32519.0</td><td>OR</td><td>6807 Joseph Hill</td><td>31.0</td><td>[approve]</td></tr>\n",
"<tr><th>5</th><td>203.0.18.22</td><td>synth_sparkstimothy@yahoo.com</td><td>Mozilla/5.0 (compatible; MSIE 6.0; Windows NT ...</td><td>John Bright</td><td>(555)620 - 4924</td><td>Lake Carol</td><td>32851.0</td><td>NY</td><td>0226 David Groves Suite 958</td><td>33.0</td><td>[approve]</td></tr>\n",
"<tr><th>6</th><td>198.51.97.207</td><td>synth_justin87@hotmail.com</td><td>Mozilla/5.0 (iPad; CPU iPad OS 6_1_6 like Mac ...</td><td>Crystal Williams</td><td>(555)270 - 3596</td><td>Brandonbury</td><td>34476.0</td><td>WA</td><td>5130 Brian Row</td><td>35.0</td><td>[approve]</td></tr>\n",
"<tr><th>7</th><td>192.30.212.77</td><td>synth_pattonjames@gmail.com</td><td>Opera/9.90.(X11; Linux i686; ak-GH) Presto/2.9...</td><td>Robert Berry</td><td>(555)817 - 8201</td><td>Keithchester</td><td>33288.0</td><td>OR</td><td>79636 Tricia Cape Suite 135</td><td>49.0</td><td>[approve]</td></tr>\n",
"<tr><th>8</th><td>203.7.244.65</td><td>synth_debbie01@hotmail.com</td><td>Mozilla/5.0 (iPad; CPU iPad OS 10_3_3 like Mac...</td><td>Douglas Lambert MD</td><td>(555)320 - 5839</td><td>East Paigehaven</td><td>32006.0</td><td>WA</td><td>44509 Hall Ford Suite 198</td><td>27.0</td><td>[approve]</td></tr>\n",
"<tr><th>9</th><td>198.51.58.6</td><td>synth_chelseaboyer@gmail.com</td><td>Opera/9.53.(Windows 98; tr-CY) Presto/2.9.162 ...</td><td>David Hall</td><td>(555)248 - 9638</td><td>Lake Brandonchester</td><td>32782.0</td><td>CT</td><td>1149 Jones Spurs Suite 787</td><td>407.0</td><td>[approve]</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " ip_address email_address user_agent customer_name phone_number customer_city customer_postal customer_state customer_address score outcome\n", "0 168.49.13.67 synth_emily31@gmail.com Opera/8.80.(Windows NT 5.0; ia-FR) Presto/2.9.... Tabitha Rodgers (555)416 - 4931 West Brenda 33426.0 WA 8252 Garcia Pass Suite 148 38.0 [approve]\n", "1 212.201.60.16 synth_kjohnson@hotmail.com Mozilla/5.0 (iPad; CPU iPad OS 9_3_5 like Mac ... Jeremy Scott (555)514 - 7683 New Sethberg 32170.0 ND 201 Bowen Harbor 43.0 [approve]\n", "2 116.14.61.213 synth_swansonchristopher@gmail.com Mozilla/5.0 (Android 4.0; Mobile; rv:27.0) Gec... Whitney Cabrera (555)858 - 6712 Jacksonborough 32994.0 CO 524 Burke Inlet 281.0 [approve]\n", "3 192.89.123.199 synth_ufox@yahoo.com Mozilla/5.0 (iPad; CPU iPad OS 5_1_1 like Mac ... Timothy Campos (555)349 - 6656 Christophershire 34828.0 AK 41584 Kelli Mount 26.0 [approve]\n", "4 192.88.215.76 synth_sbrewer@yahoo.com Opera/9.75.(X11; Linux i686; iu-CA) Presto/2.9... Justin Mathews (555)519 - 3973 Richardside 32519.0 OR 6807 Joseph Hill 31.0 [approve]\n", "5 203.0.18.22 synth_sparkstimothy@yahoo.com Mozilla/5.0 (compatible; MSIE 6.0; Windows NT ... John Bright (555)620 - 4924 Lake Carol 32851.0 NY 0226 David Groves Suite 958 33.0 [approve]\n", "6 198.51.97.207 synth_justin87@hotmail.com Mozilla/5.0 (iPad; CPU iPad OS 6_1_6 like Mac ... Crystal Williams (555)270 - 3596 Brandonbury 34476.0 WA 5130 Brian Row 35.0 [approve]\n", "7 192.30.212.77 synth_pattonjames@gmail.com Opera/9.90.(X11; Linux i686; ak-GH) Presto/2.9... Robert Berry (555)817 - 8201 Keithchester 33288.0 OR 79636 Tricia Cape Suite 135 49.0 [approve]\n", "8 203.7.244.65 synth_debbie01@hotmail.com Mozilla/5.0 (iPad; CPU iPad OS 10_3_3 like Mac... Douglas Lambert MD (555)320 - 5839 East Paigehaven 32006.0 WA 44509 Hall Ford Suite 198 27.0 [approve]\n", "9 198.51.58.6 synth_chelseaboyer@gmail.com Opera/9.53.(Windows 98; tr-CY) Presto/2.9.162 ... 
David Hall (555)248 - 9638 Lake Brandonchester 32782.0 CT 1149 Jones Spurs Suite 787 407.0 [approve]" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "predictions = pd.DataFrame(predicted_dat)\n", "display(predictions)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 7.2 Batch Predictions\n", "---\n", "You can also run batch predictions with the [CreateBatchPredictionJob](https://docs.aws.amazon.com/frauddetector/latest/api/API_CreateBatchPredictionJob.html) API. With this method, you pass the prediction job a file of input data stored at the S3 location given by the `inputPath` parameter. When the job finishes, the prediction output is written to the S3 location given by the `outputPath` parameter." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import s3fs\n", "\n", "%store -r afd_bucket\n", "%store -r afd_prefix\n", "\n", "df = pd.read_csv(S3_FILE_LOC)\n", "\n", "PREDICTION_OUTPUT_PATH=f\"s3://{afd_bucket}/{afd_prefix}/batch_prediction/outcomes.csv\"\n", "\n", "# Note: you must prepare an input CSV file without the EVENT_LABEL and EVENT_TIMESTAMP columns, for example from df[eventVariables].\n", "# This file is not provided in this tutorial, so the job below is given S3_FILE_LOC; to use your own file, create it from the \"df\" dataframe and pass PREDICTION_INPUT_PATH as inputPath.\n", "PREDICTION_INPUT_PATH=f\"s3://{afd_bucket}/{afd_prefix}/batch_prediction/input.csv\" \n", "\n", "JOB_ID=f'{MODEL_NAME}_{sufx}'\n", "\n", "response = client.create_batch_prediction_job(jobId=JOB_ID,\n", " inputPath=S3_FILE_LOC,\n", " outputPath=PREDICTION_OUTPUT_PATH,\n", " eventTypeName=EVENT_TYPE,\n", " detectorName=DETECTOR_NAME,\n", " detectorVersion='1',\n", " iamRoleArn=ARN_ROLE)\n", "print(\"Batch Prediction job created ... 
\")\n", "display(JSON(response))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Wait for the batch prediction job to complete." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"Waiting until job status is complete \")\n", "stime = time.time()\n", "while True:\n", "    current_time = datetime.now()\n", "    clear_output(wait=True)\n", "    response = client.get_batch_prediction_jobs(jobId=JOB_ID,\n", "                                                maxResults=1)\n", "    status = response['batchPredictions'][0]['status']\n", "    # Stop polling once the job reaches a terminal state, so a failed or canceled job does not loop forever.\n", "    if status in ('COMPLETE', 'FAILED', 'CANCELED'):\n", "        print(f\"{current_time} Batch Prediction Job status : {status}\")\n", "        break\n", "    print(f\"{current_time}: current progress: {(time.time() - stime)/60:{3}.{3}} minutes\")\n", "    time.sleep(60)  # sleep for 1 minute\n", "\n", "etime = time.time()\n", "print(\"\\nElapsed time : %s\" % (etime - stime) + \" seconds \\n\" )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load and analyze the predictions." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_pred = pd.read_csv(PREDICTION_OUTPUT_PATH)\n", "df_pred.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 8. 
Conclusion \n", "-----\n", "overview\n", "\n", "In this notebook:\n", "\n", "* We created and trained our Amazon Fraud Detector model\n", "* We analyzed the model's performance by looking at the false positive rate\n", "* We created rules, a detector, and a detector version\n", "* We ran real-time and batch predictions using our trained Amazon Fraud Detector model" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "instance_type": "ml.t3.medium", "kernelspec": { "display_name": "Python 3 (Data Science)", "language": "python", "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-east-2:429704687514:image/datascience-1.0" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" } }, "nbformat": 4, "nbformat_minor": 4 }