{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "99b69bc8",
   "metadata": {},
   "source": [
    "## Restore Variables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a7a3550c",
   "metadata": {},
   "outputs": [],
   "source": [
    "%store -r"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eb0a8f5a",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Deploying to a Model Endpoint"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "afeab85a",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash -s \"$NEPTUNE_ENDPOINT\" \"$REGION\" \"$TRAINING_ID\" \"$ENDPOINT_ID\"\n",
    "\n",
    "awscurl --region $2 --service neptune-db -X POST https://$1:8182/ml/endpoints -H 'Content-Type: application/json' -d '\n",
    "    {\n",
    "        \"id\" : \"'${4}'\",\n",
    "        \"mlModelTrainingJobId\": \"'${3}'\"\n",
    "      }'\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1d67b1bb",
   "metadata": {},
   "source": [
    "## Get the Endpoint Name"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e89b9275",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash -s \"$NEPTUNE_ENDPOINT\" \"$ENDPOINT_ID\" --out endpoint_response\n",
    "\n",
    "curl -s https://${1}:8182/ml/endpoints/${2}"
   ]
  },
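  {
   "cell_type": "code",
   "execution_count": null,
   "id": "92f92549",
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "# The response body is JSON, so parse it with json.loads; ast.literal_eval\n",
    "# fails on JSON literals such as true, false, and null.\n",
    "endpoint_name = json.loads(endpoint_response)['endpoint']['name']"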
   ]
  },
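  {
   "cell_type": "markdown",
   "id": "3a9d41c7",
   "metadata": {},
   "source": [
    "As a minimal sketch of the parsing step above, the next cell extracts the endpoint name from a JSON response body and raises a clear error when the expected keys are missing. The helper name `extract_endpoint_name` and the sample payload are hypothetical and exist only for illustration; the only part taken from this notebook is the `['endpoint']['name']` lookup."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b50e6f12",
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "\n",
    "def extract_endpoint_name(response_text):\n",
    "    \"\"\"Return response['endpoint']['name'] from a JSON response body.\"\"\"\n",
    "    payload = json.loads(response_text)\n",
    "    try:\n",
    "        return payload['endpoint']['name']\n",
    "    except (KeyError, TypeError) as err:\n",
    "        raise ValueError('Response has no endpoint name: ' + response_text) from err\n",
    "\n",
    "\n",
    "# Hypothetical payload, for illustration only (not a real Neptune ML response).\n",
    "sample_response = '{\"endpoint\": {\"name\": \"example-endpoint\"}}'\n",
    "sample_endpoint_name = extract_endpoint_name(sample_response)\n",
    "print(sample_endpoint_name)"
   ]
  },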
  {
   "cell_type": "markdown",
   "id": "f8ddbcf2",
   "metadata": {},
   "source": [
    "While the model is being deployed, let's visualize the computed embeddings."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5c3d4142",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Visualization\n",
    "\n",
    "During model training, Neptune ML produces predictions and computes the node embeddings, then saves both in the training output location. In the next cells, we use a helper library to download and visualize the node embeddings saved by Neptune ML."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f871cefb",
   "metadata": {},
   "source": [
    "#### Define the graph notebook config\n",
    "The helper library uses this configuration to look up information about the trained model.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "db3c5a32",
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "neptune_config = {\n",
    "  \"host\": NEPTUNE_ENDPOINT,\n",
    "  \"port\": 8182,\n",
    "  \"auth_mode\": \"DEFAULT\",\n",
    "  \"load_from_s3_arn\": NEPTUNE_LOAD_ROLE,\n",
    "  \"ssl\": True,\n",
    "  \"aws_region\": REGION,\n",
    "  \"sparql\": {\n",
    "    \"path\": \"sparql\"\n",
    "  }\n",
    "}\n",
    "neptune_config_json = json.dumps(neptune_config, indent = 4)\n",
    "\n",
    "with open('/home/ec2-user/graph_notebook_config.json', 'w') as file:\n",
    "    file.write(neptune_config_json)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b9f53cbc",
   "metadata": {},
   "source": [
    "#### Download the generated embeddings and predictions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d8942d1c",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "import neptune_ml_utils as neptune_ml\n",
    "\n",
    "transaction_mapping = neptune_ml.get_node_to_idx_mapping(dataprocessing_job_name=PROCESSING_ID,vertex_label=\"Transaction\")\n",
    "embeddings = neptune_ml.get_embeddings(training_job_name=TRAINING_ID)\n",
    "predictions = neptune_ml.get_predictions(training_job_name=TRAINING_ID, class_preds=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fd2ba55f",
   "metadata": {},
   "source": [
    "#### Reduce the embedding dimensions for visualization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "84c85aa9",
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "import matplotlib.pyplot as plt\n",
    "from sklearn.decomposition import PCA\n",
    "\n",
    "# Project the embeddings onto their first three principal components.\n",
    "pca = PCA(n_components=3)\n",
    "pca_result = pca.fit_transform(embeddings)\n",
    "\n",
    "# Keep each component as its own array for plotting.\n",
    "pcaone = pca_result[:, 0]\n",
    "pcatwo = pca_result[:, 1]\n",
    "pcathree = pca_result[:, 2]\n",
    "print('Explained variance ratio per principal component: {}'.format(pca.explained_variance_ratio_))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d42a536a",
   "metadata": {},
   "source": [
    "#### Plot the embeddings in a 2D graph"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "45d3d75a",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig = plt.figure(figsize=(16,12))\n",
    "fig.suptitle(\"2D representation of node embeddings\")\n",
    "\n",
    "scatter = plt.scatter(pcaone, pcatwo, c=predictions)\n",
    "plt.legend(*scatter.legend_elements(), title=\"isFraud\", loc=\"upper right\")\n",
    "plt.grid()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a267646e",
   "metadata": {},
   "source": [
    "## Invoke the Deployed Endpoint\n",
    "\n",
    "Because Neptune ML deploys the model to an Amazon SageMaker endpoint, you can also invoke that endpoint directly to get the predicted value of the `isFraud` property."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6e0b53d6",
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "import boto3\n",
    "\n",
    "# SageMaker runtime client, used to invoke the endpoint deployed by Neptune ML.\n",
    "client = boto3.client('sagemaker-runtime')\n",
    "\n",
    "# Request the top isFraud prediction (topk=1) for each vertex ID in idss.\n",
    "data = {\"vertices\": idss, \"topk\": 1, \"property\": \"isFraud\"}\n",
    "response = client.invoke_endpoint(EndpointName=endpoint_name,\n",
    "                                  Body=json.dumps(data))\n",
    "res = json.loads(response['Body'].read())\n",
    "\n",
    "# Keep the first inferred value for every node in the response.\n",
    "results = [node['mlResults'][0]['inferredValue'] for node in res['output']['nodes']]"
   ]
  },
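  {
   "cell_type": "markdown",
   "id": "6c7f20ae",
   "metadata": {},
   "source": [
    "To make the response handling above concrete, the next cell runs the same extraction on a hand-written payload. The payload and the helper name `top_inferred_values` are hypothetical; the payload contains only the keys the cell above reads (`output`, `nodes`, `mlResults`, `inferredValue`), and a real response may carry additional fields."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e18b93d4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical response body, shaped like the keys read in the cell above.\n",
    "sample_res = {\n",
    "    'output': {\n",
    "        'nodes': [\n",
    "            {'mlResults': [{'inferredValue': 0}]},\n",
    "            {'mlResults': [{'inferredValue': 1}]},\n",
    "        ]\n",
    "    }\n",
    "}\n",
    "\n",
    "\n",
    "def top_inferred_values(res):\n",
    "    \"\"\"Collect the first inferred value for every node in the response.\"\"\"\n",
    "    return [node['mlResults'][0]['inferredValue'] for node in res['output']['nodes']]\n",
    "\n",
    "\n",
    "sample_results = top_inferred_values(sample_res)\n",
    "print(sample_results)"
   ]
  },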
  {
   "cell_type": "markdown",
   "id": "bd592861",
   "metadata": {},
   "source": [
    "#### Get the original labels to compute the confusion matrix"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ed52050b",
   "metadata": {},
   "outputs": [],
   "source": [
    "y_test = [i['isFraud'][0] for i in fraud_labels]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c9ba6157",
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.metrics import accuracy_score\n",
    "accuracy = accuracy_score(y_test, results)\n",
    "print(\"Accuracy: %.2f%%\" % (accuracy * 100.0))"
   ]
  },
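  {
   "cell_type": "markdown",
   "id": "0d4a75b9",
   "metadata": {},
   "source": [
    "Accuracy alone can be misleading when one class is rare, as fraud labels often are. As a rough sketch, the next cell computes precision and recall by hand on a small set of made-up labels; `toy_true` and `toy_pred` are illustrative values, not output from this model. The same arithmetic applies to `y_test` and `results`, assuming both hold binary labels with the same encoding."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f2c6e81a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Made-up labels for illustration only: 1 = fraud, 0 = legitimate.\n",
    "toy_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]\n",
    "toy_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]\n",
    "\n",
    "toy_pairs = list(zip(toy_true, toy_pred))\n",
    "true_pos = sum(1 for t, p in toy_pairs if t == 1 and p == 1)\n",
    "false_pos = sum(1 for t, p in toy_pairs if t == 0 and p == 1)\n",
    "false_neg = sum(1 for t, p in toy_pairs if t == 1 and p == 0)\n",
    "\n",
    "toy_accuracy = sum(1 for t, p in toy_pairs if t == p) / len(toy_pairs)\n",
    "# Fall back to 0.0 when a denominator is zero (no predicted or no actual positives).\n",
    "toy_precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0\n",
    "toy_recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0\n",
    "\n",
    "print('Accuracy: {:.2f}, precision: {:.2f}, recall: {:.2f}'.format(toy_accuracy, toy_precision, toy_recall))"
   ]
  },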
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "56f320c6",
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.metrics import confusion_matrix\n",
    "confusion_matrix(y_test, results)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_python3",
   "language": "python",
   "name": "conda_python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}