{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Deploy Fraud detector using code\n", "\n", "This notebook shows you how to use the sample template code base and deploy Amazon Frauddetector using mock training data. If you have your own training data, then you can substitute the mock data with your data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "# Optional, if you want to change the default region or profile\n", "# os.environ[\"AWS_DEFAULT_REGION\"]='us-west-2'\n", "# os.environ[\"AWS_PROFILE\"]='default'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# -- AWS stuff -- \n", "import boto3\n", "\n", "# Standard python\n", "import os\n", "import os, sys\n", "sys.path.append('./src')\n", "import datetime\n", "\n", "\n", "account_id = boto3.client('sts').get_caller_identity().get('Account')\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### ACTION: [SPECIFY BUCKET]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Specify your bucket here and bucket prefix\n", "fd_data_bucket =\"databucket-mlopsdemo-202013\"\n", "fd_data_bucket_prefix = \"fraudetector-demo\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Configure fraud detector data access role\n", "\n", "Configure this role for fraud detector to have read access to your data. In this example policy, Amazon Fraud detector has read access to objects in bucket whose names contain *frauddetector*\n", "\n", "```json\n", "{\n", " \"Version\": \"2012-10-17\",\n", " \"Statement\": [\n", " {\n", " \"Sid\": \"BucketLevelAccess\",\n", " \"Effect\": \"Allow\",\n", " \"Action\": [\n", " \"s3:ListBucket\",\n", " \"s3:GetBucketLocation\"\n", " ],\n", " \"Resource\": \"arn:aws:s3:::*frauddetector*\"\n", " },\n", " {\n", " \"Sid\": \"ObjectAccess\",\n", " \"Effect\": \"Allow\",\n", " \"Action\": \"s3:GetObject\",\n", " \"Resource\": \"arn:aws:s3:::*frauddetector*/*\"\n", " },\n", " {\n", " \"Effect\": \"Allow\",\n", " \"Principal\": {\n", " \"Service\": \"frauddetector.amazonaws.com\"\n", " },\n", " \"Action\": \"sts:AssumeRole\"\n", " }\n", " ]\n", "}\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### ACTION: [SPECIFY ROLE]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Specify the role..\n", "frauddetector_role_name = \"DemoStackUSWest2-S1EnvironmentBootstrap-FraudDetectorRole\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### General config" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "s3_uri_data = \"s3://{}/{}\".format(fd_data_bucket, fd_data_bucket_prefix)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "frauddetector_role_arn=\"arn:aws:iam::{}:role/{}\".format(account_id, frauddetector_role_name)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import logging\n", "logging.basicConfig(level=\"INFO\", handlers=[logging.StreamHandler(sys.stdout)],\n", " format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "local_sample_file=os.path.join(\"src\", \"sample_data_variable_creation.csv\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Transformation\n", "\n", "This sample creates mock data to get started.." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%time\n", "\n", "from main_demo_data_transformer import DemoDataTransformer\n", "\n", "transformer = DemoDataTransformer()\n", "transformer.run_pipeline(s3_destination_uri=s3_uri_data)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "\n", "\n", "df = pd.read_csv(s3_uri_data + \"/train.csv\")\n", "df.sample(n=10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Mini sample data to dynamically create variables\n", "Create a mini copy of the sample data so that the Fraud Detector variables can be inferred from the data in the next step" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Save a copy small subset of data in src , so it can be used to create the Fraud detector variables\n", "transformer.write_csv(df.sample(n=10, random_state=42), local_sample_file)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create model - Train with your dataset " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "model_name = \"frauddetector_demo_mlops\"\n", "model_version = None\n", "wait = True\n", "model_description=\"This is a demo model\"\n", "event_type_name = \"demoevent-mlops\"\n", "entity=\"democustomer-mlops\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model_type = \"ONLINE_FRAUD_INSIGHTS\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%time\n", "\n", "from core.fraud_detector_train import FraudDetectorTrain\n", "from features.feature_variables_dynamic import FeatureVariablesDynamic\n", "from core.fraud_detector_event import FraudDetectorEvent\n", "import pandas as pd\n", "\n", "# Step 1: Define features\n", "model_variables = FeatureVariablesDynamic(pd.read_csv(local_sample_file), true_labels=[1])\n", "\n", "# Step 2: Define event\n", "model_event = FraudDetectorEvent()\n", "model_event.create_event(event_type_name=event_type_name, description=\"This is a demo event\", entity=entity,\n", " model_variables=model_variables)\n", "\n", "# Step 3: Run training\n", "trainer = FraudDetectorTrain()\n", "model_details = trainer.run(model_name=model_name,model_type=model_type,event_type_name=event_type_name,\n", " role_arn = frauddetector_role_arn, model_variables=model_variables, \n", " model_description=model_description, \n", " s3_training_file=s3_uri_data, wait=True)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "def get_model(model_name, model_version):\n", " client = boto3.client('frauddetector')\n", " \n", " response = client.describe_model_versions(\n", " modelId=model_name,\n", " modelVersionNumber=model_version,\n", " modelType='ONLINE_FRAUD_INSIGHTS')\n", " \n", " return response\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model_version = model_details[\"modelVersionNumber\"]\n", "get_model(model_name, model_version)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create Detector - To predictions on new data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model_to_deploy_name, model_to_deploy_version = model_name , model_version or \"1.0\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "detector_name = \"demo_detector\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from rules.detector_rule_model_score_positive import DetectorRuleModelScorePositive\n", "\n", "detector_rules = [\n", " DetectorRuleModelScorePositive(rule_id=\"positivescorerule\", model_name=model_to_deploy_name, threshold=2.6)\n", "]\n", " " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%time\n", "\n", "from core.fraud_detector_model_based_deploy import FraudDetectorModelBasedDeploy\n", "\n", "deployer = FraudDetectorModelBasedDeploy()\n", "models_details = [\n", " {\"modelId\": model_to_deploy_name,\n", " \"modelDescription\": model_description,\n", " \"modelType\": model_type,\n", " \"modelVersionNumber\": model_to_deploy_version}\n", "]\n", "detector = deployer.deploy(detector_name=detector_name, \n", " detector_rules=detector_rules, \n", " event_type_name=event_type_name,\n", " detector_description=\"This is a demo detector\",\n", " model_versions=models_details)\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "detector" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Predict" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "detector_version = detector[\"detectorVersionId\"]\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def predict(detector_id, detector_version, event_id, attributes_dict, event_timestamp, event_type_name, entity, entity_id):\n", " client = boto3.client('frauddetector')\n", " response = client.get_event_prediction(\n", " detectorId= detector_id ,\n", " eventTypeName = event_type_name,\n", " detectorVersionId=detector_version,\n", " eventId = event_id,\n", " entities = [{'entityType':entity, 'entityId':entity_id}],\n", " eventTimestamp=event_timestamp,\n", " eventVariables=attributes_dict\n", ")\n", " return response" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from datetime import timezone, datetime\n", "\n", "predict(detector_name, detector_version=detector_version, \n", " event_id=\"1\", \n", " event_timestamp= datetime.utcnow().isoformat(timespec='seconds') + \"Z\",\n", " event_type_name=event_type_name, \n", " entity= entity,\n", " entity_id = \"demo1\",\n", " attributes_dict= {\"ip\": \"122.233.333.333\", \n", " \"email\":\"edum@a.com\", \n", " }\n", " )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## CloudWatch metrics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This shows you the list of models used in the last 7 days" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import json\n", "\n", "\n", "def get_models_used(used_in_days):\n", " import boto3\n", " from datetime import datetime, timedelta\n", "\n", "\n", " # Create CloudWatch client\n", " cloudwatch = boto3.client('cloudwatch')\n", "\n", " models_used_in_n_days = []\n", "\n", " # List ModelInvocation metric for AWS/FraudDetector\n", " paginator = cloudwatch.get_paginator('list_metrics')\n", " for response in paginator.paginate(MetricName='GetEventPrediction',\n", " Namespace='AWS/FraudDetector'):\n", "\n", " # For metric available query stats\n", " for metric in response[\"Metrics\"]:\n", " metric_dim = metric[\"Dimensions\"]\n", "\n", " resp_metric_stats = cloudwatch.get_metric_statistics(Namespace='AWS/FraudDetector',\n", " MetricName='GetEventPrediction',\n", " Dimensions=metric_dim,\n", " StartTime=datetime.now() - timedelta(days=used_in_days),\n", " EndTime=datetime.now(),\n", " Period=60 * 60 * 24 * used_in_days,\n", " Statistics=[\n", " 'SampleCount'\n", " ]\n", " )\n", "\n", " models_used_in_n_days.append({\"metric_dim\": metric_dim, \"data\": resp_metric_stats[\"Datapoints\"]})\n", "\n", " return models_used_in_n_days" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "models_used_in_last_n_days = get_models_used(used_in_days = 7)\n", "\n", "print(json.dumps(models_used_in_last_n_days, indent=1, default=str))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## [Optional] Delete demo detector + undeploy model\n", "This is so that you do not incur unexpected charges due to demo models running in your account" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from core.fraud_detector_undeploy import FraudDetectorUndeploy\n", "\n", "deploy = FraudDetectorUndeploy()\n", "deploy.delete_detector(detector_name)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "deploy.undeploy_model(model_to_deploy_name, model_to_deploy_version)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 1 }