{ "cells": [ { "cell_type": "markdown", "id": "99b69bc8", "metadata": {}, "source": [ "## Restore Variables" ] }, { "cell_type": "code", "execution_count": null, "id": "a7a3550c", "metadata": {}, "outputs": [], "source": [ "%store -r" ] }, { "cell_type": "markdown", "id": "eb0a8f5a", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Deploying to a Model Endpoint" ] }, { "cell_type": "code", "execution_count": null, "id": "afeab85a", "metadata": {}, "outputs": [], "source": [ "%%bash -s \"$NEPTUNE_ENDPOINT\" \"$REGION\" \"$TRAINING_ID\" \"$ENDPOINT_ID\"\n", "\n", "awscurl --region $2 --service neptune-db -X POST https://$1:8182/ml/endpoints -H 'Content-Type: application/json' -d '\n", " {\n", " \"id\" : \"'${4}'\",\n", " \"mlModelTrainingJobId\": \"'${3}'\"\n", " }'\n" ] }, { "cell_type": "markdown", "id": "1d67b1bb", "metadata": {}, "source": [ "## Get the Endpoint Name" ] }, { "cell_type": "code", "execution_count": null, "id": "e89b9275", "metadata": {}, "outputs": [], "source": [ "%%bash -s \"$NEPTUNE_ENDPOINT\" \"$ENDPOINT_ID\" --out endpoint_response\n", "\n", "curl -s https://${1}:8182/ml/endpoints/${2}" ] }, { "cell_type": "code", "execution_count": null, "id": "92f92549", "metadata": {}, "outputs": [], "source": [ "import ast\n", "endpoint_name = ast.literal_eval(endpoint_response)['endpoint']['name']" ] }, { "cell_type": "markdown", "id": "f8ddbcf2", "metadata": {}, "source": [ "While the model is being deployed, let's get visualize the computed embeddings" ] }, { "cell_type": "markdown", "id": "5c3d4142", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Visualization\n", "\n", "During the model training, Neptune ML will work on producing predictions and calculating the node embeddings then save them in the training output location. In the next cell, we use a helper library to download an visualize the node embeddings saved by Neptune ML" ] }, { "cell_type": "markdown", "id": "f871cefb", "metadata": {}, "source": [ "#### Define the graph notebook config\n", "This will be used by the helper library to get the right information about the model trained\n" ] }, { "cell_type": "code", "execution_count": null, "id": "db3c5a32", "metadata": {}, "outputs": [], "source": [ "import json\n", "neptune_config = {\n", " \"host\": NEPTUNE_ENDPOINT,\n", " \"port\": 8182,\n", " \"auth_mode\": \"DEFAULT\",\n", " \"load_from_s3_arn\": NEPTUNE_LOAD_ROLE,\n", " \"ssl\": True,\n", " \"aws_region\": REGION,\n", " \"sparql\": {\n", " \"path\": \"sparql\"\n", " }\n", "}\n", "neptune_config_json = json.dumps(neptune_config, indent = 4)\n", "\n", "with open('/home/ec2-user/graph_notebook_config.json', 'w') as file:\n", " file.write(neptune_config_json)" ] }, { "cell_type": "markdown", "id": "b9f53cbc", "metadata": {}, "source": [ "#### Download the generated embeddings and predictions" ] }, { "cell_type": "code", "execution_count": null, "id": "d8942d1c", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "import neptune_ml_utils as neptune_ml\n", "\n", "transaction_mapping = neptune_ml.get_node_to_idx_mapping(dataprocessing_job_name=PROCESSING_ID,vertex_label=\"Transaction\")\n", "embeddings = neptune_ml.get_embeddings(training_job_name=TRAINING_ID)\n", "predictions = neptune_ml.get_predictions(training_job_name=TRAINING_ID, class_preds=True)" ] }, { "cell_type": "markdown", "id": "fd2ba55f", "metadata": {}, "source": [ "#### Reduce the embeddings dimensions for visulaization" ] }, { "cell_type": "code", "execution_count": null, "id": "84c85aa9", "metadata": {}, "outputs": [], "source": [ "from sklearn.decomposition import PCA\n", "from sklearn.manifold import TSNE\n", "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "from mpl_toolkits.mplot3d import Axes3D\n", "import seaborn as sns\n", "\n", "pca = PCA(n_components=3)\n", "pca_result = pca.fit_transform(embeddings)\n", "\n", "pcaone = pca_result[:,0]\n", "pcatwo = pca_result[:,1] \n", "pcathree = pca_result[:,2]\n", "print('Explained variation per principal component: {}'.format(pca.explained_variance_ratio_))" ] }, { "cell_type": "markdown", "id": "d42a536a", "metadata": {}, "source": [ "#### Plot the embeddings in 2D graph" ] }, { "cell_type": "code", "execution_count": null, "id": "45d3d75a", "metadata": {}, "outputs": [], "source": [ "fig = plt.figure(figsize=(16,12))\n", "fig.suptitle(\"2D representation of node embeddings\")\n", "\n", "scatter = plt.scatter(pcaone, pcatwo, c=predictions)\n", "plt.legend(*scatter.legend_elements(), title=\"isFraud\", loc=\"upper right\")\n", "plt.grid()" ] }, { "cell_type": "markdown", "id": "a267646e", "metadata": {}, "source": [ "## Invoke the Deployed Endpoint\n", "\n", "Since Neptune ML will deploy an endpoint using Amazon SageMaker, you can also invoke the SageMaker endpoint and generate the score for the fraud label" ] }, { "cell_type": "code", "execution_count": null, "id": "6e0b53d6", "metadata": {}, "outputs": [], "source": [ "import json\n", "import boto3\n", "client = boto3.client('runtime.sagemaker')\n", "data = {\"vertices\": idss, \"topk\": 1, \"property\": \"isFraud\"} \n", "response = client.invoke_endpoint(EndpointName=endpoint_name,\n", " Body=json.dumps(data))\n", "response_body = response['Body'] \n", "res = json.loads(response_body.read())\n", "results = []\n", "for i in res['output']['nodes']:\n", " results.append(i['mlResults'][0]['inferredValue'])" ] }, { "cell_type": "markdown", "id": "bd592861", "metadata": {}, "source": [ "#### Get Original labels to Compute Confusion Matrix" ] }, { "cell_type": "code", "execution_count": null, "id": "ed52050b", "metadata": {}, "outputs": [], "source": [ "y_test = [i['isFraud'][0] for i in fraud_labels]" ] }, { "cell_type": "code", "execution_count": null, "id": "c9ba6157", "metadata": {}, "outputs": [], "source": [ "from sklearn.metrics import accuracy_score\n", "accuracy = accuracy_score(y_test, results)\n", "print(\"Accuracy: %.2f%%\" % (accuracy * 100.0))" ] }, { "cell_type": "code", "execution_count": null, "id": "56f320c6", "metadata": {}, "outputs": [], "source": [ "from sklearn.metrics import confusion_matrix\n", "confusion_matrix(y_test, results)" ] } ], "metadata": { "kernelspec": { "display_name": "conda_python3", "language": "python", "name": "conda_python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.13" } }, "nbformat": 4, "nbformat_minor": 5 }